Robotics  Research 
Tbchnical  Report 


Geo, 


"^'afcv. 


'lurn 


omm. 


m^:>'^o, 


Implicitly  Representing  Arrangements 
of  Lines  or  Segments 

by 

Herbert  Edelsbrunner,  Leonidas  Guibas,  John  Herberger, 
Raimond  Seidel,  Micha  Sharir,  Jack  Snoeyink,  Emo  Welzl 


Technical  Report  No.  380 

Robotics  Report  No.  157 

June, 1988 


\'-^, 


New  York  University 
Eint  Institute  of  Mathematical  Sciences 

Computer  Science  Division 

25 1  Mercer  Street  New  York,  NX  WO] 2 


Implicitly  Representing  Arrangements 
of  Lines  or  Segments 

by 

Herbert  Edelsbrunner,  Leonidas  Guibas,  John  Herberger, 
Raimond  Seidel,  Micha  Sharir,  Jack  Snoeyink,  Emo  Welzl 


Technical  Report  No.  380 

Robotics  Report  No.  157 

June, 1988 


New  York  University 

Dept.  of  Computer  Science 

Courant  Institute  of  Mathematical  Sciences 

251  Mercer  Street 

New  York,  New  York   10012 


The  first  author  is  pleased  to  acknowledge  the  support  of  Amoco  Fnd.  Fac.  Dev.  Comput. 
Sci.  1-6-44862  and  National  Science  Foundation  Grant  CCR-8714565.  Work  on  this  paper  by 
the  fifth  author  has  ben  supported  by  Office  of  Naval  Research  Grant  N00014-87-K-0129,  by 
National  Science  Foundation  Grant  NSF-DCR-83-20085,  by  grants  from  the  Digital  Equip- 
ment Corporation,  and  the  IBM  Corporation,  and  by  a  research  grant  from  the  NCRD  -  the 
Israeli  National  Council  for  Research  and  Development.  The  sixth  author  was  supported  in 
part  by  a  National  Science  Foundation  Graduate  Fellowship. 


Implicitly  Representing  Arrangements  of  Lines  or  Segments* 

Herbert  Edelsbrunner',  Leonidas  Guibas^'"',  John  Hershberger"', 
Raimund  Seidel^-*,  Micha  Sharir^'^,  Jack  Snoeyink^,  and  Emo  Welzl* 

'University  of  Illinois  at  Urbana-Champaign,  'Stanford  University,  ^DEC  Systems  Research  Center, 
^IBM  Almaden  Research  Center,  ^University  of  California  at  Berkeley,  ®New  York  University, 

^Tel  Aviv  University,  ^Free  University  BerUn 


Abstract 

An  arrangement  of  n  hnes  (or  line  segments)  in  the  plane  is  the  partition  of  the  plane  defined 
by  these  objects.  Such  an  arrangement  consists  of  0{n^)  regions,  called  faces.  In  this  paper 
we  study  the  problem  of  calculating  and  storing  arrangements  implicitly,  using  subquadratic 
space  and  preprocessing,  so  that,  given  any  query  point  p,  we  can  calculate  efficiently  the  face 
containing  p.  First,  we  consider  the  case  of  hnes  and  show  that  with  A(rj)  space'  and  A(n^/') 
preprocessing  time,  we  can  answer  face  queries  in  \{y/n)+0{K)  time,  where  K  is  the  output  size. 
(The  query  time  is  achieved  with  high  probability.)  In  the  process,  we  solve  three  interesting 
subproblems:  1)  given  a  set  of  n  points,  find  a  straight-edge  spanning  tree  of  these  points  such 
that  any  Une  intersects  only  a  few  edges  of  the  tree,  2)  given  a  simple  polygonal  path  F,  form  a 
data  structure  from  which  we  can  find  the  convex  hull  of  any  subpath  of  F  quickly,  and  3)  given 
a  set  of  points,  organize  them  so  that  the  convex  hull  of  their  subset  lying  above  a  query  Une  can 
be  found  quickly.  Second,  using  random  sampling,  we  give  a  trade-off  between  increasing  space 
and  decreasing  query  time.  Third,  we  extend  our  structure  to  report  faces  in  an  arrangement 
of  line  segments  in  A(n'/^)  +  0(K)  time,  given  A(n^/^)  space  and  A(n*/^)  preprocessing  time. 
Lastly,  we  note  that  our  techniques  allow  us  to  compute  m  faces  in  an  arrangement  of  n  lines 
in  time  A(m'/^n'/^  +  n),  which  is  nearly  optimal. 

1     Introduction 

Suppose  we  are  given  a  set  L  of  n  straight  lines  in  the  plane.  These  lines  partition  the  plane  into  a 
set  of  convex  regions  called  faces.  The  collection  of  all  these  faces  is  referred  to  as  the  arrangement 
of  L,  and  is  denoted  by  A{L).  Now  suppose  we  are  adlowed  to  perform  some  preprocessing  on  L  so 
that  the  following  type  of  query  can  be  answered  readily:  For  a  given  query  point  p,  which  face  in 
A{L)  contains  pi 
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What  do  we  mean  by  "which  face"?  We  want  a  complete  description  of  the  face  containing  p 
in  terms  of  the  lines  in  I,  that  is,  which  lines  of  L  appear  along  the  boundary  of  that  face  in,  say, 
counterclockwise  order.  In  some  cases  we  may  be  satisfied  with  a  simpler  property  of  the  face,  such 
as  its  intersection  with  a  line,  its  leftmost  vertex,  the  edge  below  p,  or  the  number  of  lines  on  the 
face  boundary. 

One  direct  approach  to  this  problem  is  to  precompute  complete  descriptions  of  all  the  faces 
in  A{L).  Given  a  query  point  p,  we  then  just  need  to  locate  p  in  the  face  containing  it;  the 
precomputed  description  of  that  face  can  be  used  to  answer  the  query  quickly.  The  complete 
descriptions  of  aJl  the  faces  in  A{L)  can  be  computed  in  0{n^)  time  [CGL85,  EOS86];  determining 
which  face  of  A{L)  contains  p  is  an  instance  of  planar  point  location,  which  can  be  solved  in 
O(logn)  time  [Kir83,  EGS86,  ST86]  after  Oi^ri^)  preprocessing.  Thus,  after  ©(n^)  preprocessing, 
any  face-reporting  query  can  be  answered  in  0(logn  +  K)  time,  where  K  is  the  "output  size," 
namely  the  number  of  lines  that  bound  the  desired  face.  The  simpler  queries  listed  above  can  be 
answered  in  O(logn)  time. 

But  what  if  it  is  deemed  excessive  to  use  O(n^)  time  for  preprocessing,  or  if  the  0(n^)  space 
that  the  above  method  requires  is  not  available?  What  if  the  space  at  our  disposal  is  only  about 

linear  in  n? 

Again,  there  is  a  straightforward  solution:  don't  do  any  preprocessing  at  all;  simply  use  any  set 
representation  as  the  data  structure  for  I.  When  pven  a  query  point  p,  first  determine  for  each 
line  t  £  L  which  of  the  two  halfplanes  bounded  by  I  contains  p,  and  then  construct  the  intersection 
of  these  halfplanes.  This  takes  0(n  log  n)  time  using  divide-and-conquer  [SH76].  If  the  lines  in  L 
are  initially  sorted  by  slope,  then  the  intersection  can  be  computed  in  0{n)  time.  This  query  time 
compares  poorly  with  the  0(log  n)  query  time  achieved  by  the  first  method. 

In  this  paper  we  address  the  question  of  whether  we  can  answer  face  queries  in  sublinear  time 
using  only  roughly  linear  space.  We  show  that  it  is  possible  to  preprocess  a  set  Z,  of  n  lines  in 
randomized  time  0(n^''^  log^  n),  producing  a  data  structure  that  uses  0(n  log  n)  space,  so  that, 
with  high  probability,  any  face- reporting  query  can  be  answered  in  time  0{y/n\og^  n  -|-  K).  The 
simpler  queries  mentioned  above  take  0{\/n\o^  n)  time.  (By  an  alternative  approach  that  uses 
more  precomputation,  we  can  reduce  the  space  to  0{n)  and  guarantee  0{y/n\o^  n  ■\-  K)  query 
time.)  We  also  present  a  similar  result  for  the  more  general  case  when  the  set  L  consists  of  line 
segments,  instead  of  (infinite)  lines. 

Our  solution  proceeds  roughly  as  follows:  First,  using  duality,  we  transform  the  problem  into 
a  related  one  about  data  points  and  query  lines. ^  In  this  dual  setting,  our  problem  becomes  that 
of  preprocessing  a  set  P  of  points  in  the  plane  so  that,  for  any  query  line  £,  one  can  determine 
quickly  the  convex  hull  of  the  points  in  P  lying  above  t,  and  the  convex  hull  of  the  points  in  P 
lying  below  £.  To  achieve  this,  we  construct  a  family  of  spanning  trees  for  P  with  the  property  that 
any  straight  line  intersects  only  few  edges  of  one  of  these  trees.  This  means  that  any  straight  line 
"cuts"  such  a  tree  into  a  small  number  of  pieces,  each  of  which  lies  completely  on  one  side  of  the 
line.  Using  an  appropriate  data  structure  based  on  the  spanning  tree,  it  is  possible  to  derive  an 
implicit  representation  of  the  convex  hulls  of  the  individual  tree  pieces  quickly,  and  these  individual 
hulls  can  then  be  combined  to  give  the  two  desired  convex  hulls. 

To  ensure  the  output-size  sensitivity  of  our  algorithm,  it  is  imperative  that  we  do  not  construct 
the  convex  huUs  of  the  tree  pieces  explicitly,  because  their  total  size  might  be  larger  than  the  size 


'We  assume  the  reader  is  familiar  with  the  concept  of  point-line  duality  in  the  plane.     Otherwise  consult 
Brown  [Bro80]  or  Edelsbrunner  [Ede87,  pp.  12-14]. 


of  the  combined  hull.  We  achieve  this  using  a  hierarchical  representation  of  convex  hulls  that  is 
described  in  Section  4. 

Our  use  of  spanning  trees  is  closely  related  to  a  recent  technique  of  Welzl  [Wel88]  for  im- 
proved halfplane  range  searching.  However,  in  our  approach  we  face  several  additional  interesting 
subproblems;  for  example,  the  initial  form  of  the  spanning  trees  that  we  produce  can  be  highly 
self-intersecting,  which  is  unsuitable  for  our  application.  We  develop  an  efficient  procedure  for  "un- 
tangling" such  a  tree  while  increasing  the  maximum  number  of  cuts  of  that  tree  by  a  line  by  at  most 
a  factor  of  two.  The  procedure  uses  the  shortest  path  algorithm  for  simple  polygons  of  [GH87]. 

We  next  consider  the  case  of  line  segments,  which  is  more  complicated  because  of  the  highly 
irregular  shape  of  the  faces  in  an  arrangement  of  segments.  We  show  that  by  using  A{n^^^)  ran- 
domized preprocessing  and  A(n^'^)  space,  we  can  obtain  the  face  containing  a  query  point  in  time 
A{n^^^)  +  0{K),  where  K  is  the  size  of  the  face.  Note  that  in  this  case  we  have  decreased  the  query 
time  at  the  cost  of  using  more  preprocessing  and  space.  We  show  that  for  the  case  of  lines,  this  is 
an  instance  of  a  generally  applicable  trade-off. 

We  believe  that  our  results  will  have  many  applications  in  speeding  up  algorithms  that  manip- 
ulate arrangements  of  lines  or  of  segments  in  the  plane.  One  application,  given  in  this  paper,  is  the 
computation  of  m  distinct  faces  in  an  arrangement  of  n  lines  in  time  \{m^l^n?l^  4-  n),  which,  by 
the  results  of  [CEG*88],  is  almost  optimal.  Another  recent  application  is  the  calculation  of  many 
components  in  an  arrangement  of  n  (intersecting)  triangles  in  3-space  [AS88]. 

The  organization  of  this  paper  is  as  follows:  Section  2  describes  some  of  the  notation  we  will 
use.  Section  3  ^ves  the  spanning  tree  construction  for  the  points  dual  to  the  lines  in  L.  Section  4 
uses  the  spanning  tree  to  find  the  convex  hiills  of  points  on  either  side  of  a  query  line,  and  then 
Section  5  applies  this  result  to  solve  the  original  face- reporting  problem  on  A{L).  Section  6  discusses 
a  general  trade-off  between  space  and  query  time,  Section  7  uses  random  sampling  to  obtain  an 
efficient  implicit  representation  for  arrangements  of  segments,  and  Section  8  presents  the  application 
of  calculating  many  faces  at  once. 

2      Geometric  Preliminaries 

We  must  make  a  few  definitions  before  we  begin.  The  convex  hull  of  a  set  S  of  points  is  denoted 
by  h{S).  Given  a  (non-vertical)  line  I,  let  t"^  be  the  closed  halfspace  above  the  line  and  let  l~  be 
the  closed  halfspace  below  the  line.  We  occasionally  use  C  to  refer  to  either  t"^  or  t~ . 

In  this  paper,  we  will  introduce  "big-A"  notation  to  suppress  logJirithmic  and  smaller  factors. 
Specifically,  if  /  and  g  are  functions  of  n,  we  write  /(n)  =  An(p(n))  if,  for  some  constant  k,  we 
have  /(n)  =  0(5(n)log*  n).  Because  this  paper  uses  only  An  (no  other  subscripts),  we  will  simply 
write  A  in  place  of  A„  in  what  follows. 

Let  L  =  {ii,t2i-  ■  ■  ,in}  be  the  set  of  n  lines  whose  arrangement,  A{L),  we  wish  to  store.  Given 
a  query  point  p,  the  face  containing  p  is  bounded  by  the  lines  that  p  can  "travel  to"  without 
crossing  lines  of  L.  We  can  map  the  set  of  lines  L  to  a  set  of  points  P  =  {pi,P2i--  -iPn]  by 
a  duality  transform  that  preserves  the  above/below  relationship  between  points  and  lines.  (See 
Edelsbrunner  for  a  precise  formulation  of  the  duality  [Ede87].)  A  primal  query  point  p  becomes  a 
dual  non-vertical  query  line  i.  The  lines  of  L  bounding  the  face  containing  p  now  become  points  of 
P  that  the  line  i  can  "travel  to"  by  translation  and  rotation  without  crossing  any  other  point.  In 
other  words,  the  dual  of  the  face  consists  of  the  portions  of  the  convex  hulls  h{Pr\t^)  and  h{Pr]i~) 
between  their  inner  common  tangents,  as  depicted  in  Figure  1.  Thus  our  problem  becomes  one  of 


finding  quickly  the  convex  hulls  of  the  points  of  P  lying  either  above  or  below  a  query  line. 


Figure  1:  The  dual  of  a  face 


3     Computing  trees  with  low  stabbing  number 

In  this  section,  we  define  the  stabbing  number  of  a  spanning  tree  on  a  configuration  of  points  in  the 
plane.  We  give  three  algorithms  for  computing  trees  with  low  stabbing  number;  these  methods  have 
trade-offs  between  space,  preprocessing  time,  the  stabbing  number,  and  the  use  of  randomization. 

We  begin  with  some  definitions.  Any  line  £  that  does  not  intersect  P  partitions  the  points  of  P 
into  two  sets — £■*"  n  P  and  £~  n  P.  Two  lines  are  equivalent  with  respect  to  P  if  they  induce  the  same 
partition.  We  will  frequently  use  £  to  denote  a  set  of  lines  such  that  every  line  that  partitions  P 
into  two  non-empty  sets  is  equivalent  to  exactly  one  line  of  C.  We  can  establish  a  correspondence 
between  the  lines  of  C  and  pairs  of  points  of  P:  If  a  line  (  £  C  is  rotated  counterclockwise  as  far 
as  possible  without  crossing  points  of  P,  it  hits  a  unique  pair  of  points,  the  left  one  from  above 
and  the  right  one  from  below.  If  the  line  determined  by  a  pair  of  points  of  P  is  rotated  slightly 
clockwise,  so  that  it  passes  above  the  left  point  and  below  the  right  one,  it  is  equivalent  to  a  unique 
line  of  C.  Therefore  there  are  Q  lines  in  C. 

Let  T  be  a  (straight-edge)  spanning  tree  on  P.  The  stabbing  number  of  a  line  £  with  respect 
to  the  tree  T  is  the  number  of  edges  of  T  cut  by  £,  that  is,  the  number  of  edges  whose  endpoints 
come  from  different  sets  of  the  partition  induced  by  £.  The  stabbing  number  of  T  is  the  maximum 
of  the  stabbing  numbers  of  all  lines  with  respect  to  T.  We  can  prove  the  following  lower  bound: 

Lemma  3.1    There  is  a  configuration  ofn  points  for  which  every  spanning  tree  has  stabbing  number 

Proof:  Let  P  be  the  configuration  of  size  n  =  m^  given  by  the  points  of  an  m  x  m  grid. 
The  arrangement  of  2m  —  2  lines  between  the  rows  and  columns  of  the  grid  isolates  each 
point  in  a  face;  thus,  each  edge  of  the  spanning  tree  must  intersect  one  of  these  lines.  The 
average  number  of  edges  intersected  by  a  line  is  (n  -  l)/(2m  -  2)  >  m/2  -  1.  So  ft(v/")  is 
a  lower  bound  for  the  stabbing  number  of  this  configuration  of  n  points.      I 


We  have  several  methods  for  computing  trees  with  stabbing  number  close  to  this  lower  bound. 
The  first  method  gives  a  slow  deterministic  construction  of  a  tree  with  optimal  stabbing  number 
0{y/n)\  the  second  uses  a  conjugation  partition  tree,  which  can  be  built  quickly,  to  make  a  spanning 
tree  with  stabbing  number  0(n°^^5)  [EW86].  These  methods  are  described  in  Section  3.1.  The 
third  method,  described  in  Section  3.2,  employs  random  sampling  to  find  a  tree  such  that,  for  any 
line  I  from  the  set  of  non-equivalent  lines  £,  the  stabbing  number  of  I  is  \{y/n)  with  probability 
greater  than  one-half.  We  can  then  produce  a  family  of  O(logn)  trees  such  that,  with  probability 
arbitrarily  close  to  1,  any  line  I  has  stabbing  number  h.{y/n)  with  respect  to  one  of  the  trees. 
Our  method  is  based  on  drawing  random  samples  from  an  implicitly  specified  set  of  lines;  this  is 
described  in  Section  3.3. 

The  first  and  third  algorithms  produce  trees  that  may  intersect  themselves  when  embedded 
in  the  plane,  so  we  also  discuss  how  to  untangle  spanning  trees  without  significantly  increasing 
their  stabbing  number.  In  particular,  in  Section  3.4  we  provide  an  algorithm  that  will,  for  the 
randomized  construction,  remove  all  self-intersections  and  at  most  double  the  stabbing  number  of 
each  line. 

3.1     Preliminary  approaches 

This  section  describes  two  deterministic  algorithms  for  building  spanning  trees  with  low  stabbing 
number.  Roughly  speaJcing,  these  algorithm  are  at  opposite  ends  of  the  trade-off  between  stabbing 
number  and  preprocessing  time  and  space.  In  Section  3.2  we  will  see  how  introducing  randomization 
lets  us  get  close  to  the  optimal  stabbing  number  without  incurring  large  preprocessing  costs. 

Theorem  3.2  For  a  configuration  of  n  points  P,  we  can  compute  a  spanning  tree  T  with  stabbing 
number  0{y/n)  by  a  procedure  that  can  be  implemented  to  run  in  0(n^  log  n)  time  and  0{n)  space. 

Proof:  Welzl  [Wel88]  has  proved,  by  an  e-net  argument,  that  the  procedure  described  in 
the  following  two  paragraphs  computes  a  spanning  tree  with  stabbing  number  0{y/nlogn). 
Cha^eUe  [ChaSS]  has  improved  the  proof  to  show  that  the  stabbing  number  obtained  is 
actually  0{y/n). 

Initially  let  T  be  the  graph  with  the  vertices  of  P  and  no  edges.  Let  £  be  a  set  of  Q) 
lines  such  that  every  line  that  partitions  P  is  equivalent  to  exactly  one  line  of  C  We  will 
be  duplicating  these  lines  in  the  algorithm,  so  here  it  would  be  more  accurate  to  call  C  a 
multiset.  A  pair  of  points  p,q  £  P  are  separated  by  some  subset  of  C — let  £p,,  C  >C  be  the 
multiset  of  lines  that  partition  p  and  q  into  separate  sets. 

Now  repeat  the  following  steps.  Find  a  pair  p,q  €  P  that  minimizes  \Cp^g\  and  add  the 
edge  {p,q)  to  the  spanning  tree  T.  Next,  duplicate  all  the  lines  of  £p,,  by  the  assignment 
C  :=  CUCp,q.  Then  discard  the  point  q  and  repeat  the  whole  process.  After  n—  1  repetitions, 
T  will  be  a  spanning  tree  with  stabbing  number  Oi^/n)  (see  [Cha88]). 

This  procedure  can  be  implemented  to  run  in  0(n^  log  n)  time  and  linear  space  by 
considering  the  dual  problem.  The  points  of  P  dualize  to  lines,  edges  {p,q)  to  double 
wedges,  and  lines  of  C  to  points  in  cells  of  the  arrangement  of  dual  lines.  Each  dual  point 
has  a  weight,  and  the  weight  of  a  double  wedge  is  the  sum  of  the  weights  of  the  points  it 
contains.  An  edge- finding  step  of  the  algorithm  involves  determining  the  wedge  of  minimum 
weight  and  doubling  the  weight  of  its  points.  One  edge-finding  step  can  be  accomplished  in 
O(n^logn)  time  and  linear  space  by  a  sweepline  algorithm. 


Our  algorithm  is  a  two-pass  algorithm;  each  pass  sweeps  a  vertical  line  from  left  to  right 
over  the  arrangement  of  dual  lines.  The  first  pass  computes,  for  each  line,  the  total  weight 
of  the  dual  points  below  the  line.  The  second  pass  enumerates  the  (j)  intersection  points 
of  the  dual  lines  in  left-to- right  order.  Each  intersection  point  corresponds  to  both  a  double 
wedge  and  a  point  that  is  the  dual  of  a  line  in  L.  At  each  intersection  point,  the  algorithm 
computes  the  weight  of  the  double  wedge  associated  with  the  intersection;  over  the  course 
of  the  sweep,  the  algorithm  computes  the  double  wedge  of  minimum  weight. 

Both  sweeps  of  the  algorithm  perform  the  same  underlying  computation.  Each  sweep 
maintains,  for  each  line,  the  total  weight  of  all  the  dual  points  below  the  line  and  to  the  left 
of  the  sweepline.  This  is  all  the  first  pass  needs  to  do  to  compute  the  total  weight  below 
each  line.  The  second  pass  uses  the  same  information  to  compute  the  weights  of  double 
wedges.  Our  duality  transform  guarantees  that  each  double  wedge  is  composed  of  left  and 
right  single  wedges,  rather  than  top  and  bottom  wedges.  When  the  sweepline  reaches  an 
intersection  of  two  lines,  the  algorithm  subtracts  the  current  weight  totals  for  the  lines  to 
get  the  weight  of  the  single  wedge  to  the  left  of  the  intersection.  By  subtracting  the  current 
weights  from  the  total  weights  for  the  two  lines,  the  algorithm  computes  the  weight  of  the 
right  wedge  as  well.  The  weight  of  the  whole  double  wedge  is  the  sum  of  the  weights  of  the 
left  and  right  wedges. 

We  now  describe  the  two  key  ideas  of  the  sweepline  algorithm.  First,  we  describe  the 
data  structures  used  to  maintain  the  total  weight  of  the  points  below  each  line  and  left  of 
the  sweepline.  Second,  we  tell  how  to  implicitly  represent  the  0(n^)  dual  points  and  their 
weights  in  0{n)  space. 

To  maintain  the  weight  for  each  line,  the  algorithm  associates  with  the  sweepline  a 
complete  binary  tree  whose  n  leaves  correspond  to  the  lines  in  the  order  they  intersect  the 
sweepline.  Each  node  of  the  tree  has  an  associated  weight;  the  algorithm  maintains  the 
following  invariant:  tht  weight  of  the  dual  points  below  a  line  I  and  left  of  the  sweepline 
is  equal  to  the  sum  of  the  node  weights  on  the  path  from  the  root  of  the  tree  to  the  leaf 
corresponding  to  £.  The  node  weights  are  initialized  to  zero  when  the  sweepline  is  at  negative 
infinity.  The  algorithm  must  change  the  node  weights  when  the  sweepline  crosses  either  a 
dual  point  or  an  intersection  of  dual  lines.  When  the  sweepline  crosses  a  dual  point,  the 
algorithm  preserves  the  invariant  by  adding  the  weight  of  the  point  to  the  weights  of  the 
roots  of  disjoint  majcimal  subtrees  whose  leaves  correspond  to  the  lines  above  the  dual  point. 
There  are  O(logn)  such  nodes — at  most  two  per  level  of  the  tree — so  this  takes  O(logn) 
time.  When  two  lines  exchange  places  on  the  sweepline,  the  weights  of  their  corresponding 
leaves  are  adjusted  in  O(logn)  time  to  preserve  the  invariant:  the  difference  between  the 
two  line  weights  is  added  to  or  subtracted  from  each  leaf  weight. 

The  algorithm  cannot  afford  to  store  the  points  and  their  weights  explicitly.  It  is  not 
hard  to  compute  the  points  during  the  sweep,  however,  since  each  is  associated  with  a  vertex 
of  the  line  arrangement.  The  weight  of  a  point  is  2*,  where  k  is  the  number  of  the  double 
wedges  chosen  so  far  that  contain  the  point.  The  algorithm  maintains,  for  each  face  of 
the  arrangement  cut  by  the  sweepline,  the  number  of  previously  chosen  double  wedges  that 
contain  the  face.  This  allows  the  algorithm  to  find  the  weight  of  each  point  in  constant  time, 
but  does  not  increase  the  asymptotic  complexity  of  the  sweep. 


The  sweepline  algorithm  takes  0{n'^\ogn)  time  and  linear  space:  Maintaining  the 
sweepline  takes  O(logn)  time  per  vertex  of  the  arrangement.  Computing  the  dual  points 
and  their  weights  takes  constant  additional  time  per  point.  Updating  the  tree  when  the 
sweepline  crosses  a  point  or  a  vertex  of  the  arrangement  takes  O(logn)  time.  Determining 
the  weight  of  a  double  wedge  also  takes  O(logn)  time,  since  it  involves  summing  the  O(log  n) 
weights  along  a  path  in  the  tree.      | 

Note  that  the  0(log  n)-time  tree  update  and  query  operations  dominate  the  running  time  of 
the  sweepline  algorithm  given  above:  therefore,  using  a  topological  sweep  [EG86]  to  maintain  the 
sweepline  would  not  decrease  the  O(n"'logn)  running  time.  It  is  possible  to  decrease  the  running 
time  to  0{n^)  if  we  use  0(n^)  space,  but  we  will  not  give  the  details  of  that  method  here. 

The  spanning  tree  produced  by  the  method  of  Theorem  3.2  may  intersect  itself.  Section  3.4  tells 
how  to  transform  the  tree  into  one  without  self-intersections  while  at  most  doubling  its  stabbing 
number. 

For  our  purpose,  the  time  bound  of  the  algorithm  of  Theorem  3.2  is  excessive;  given  O(n') 
time  and  space,  we  coxild  explicitly  construct  the  arrangement  of  lines  and  use  it  to  answer  queries. 
However,  if  we  can  accept  a  larger  stabbing  number,  then  we  can  apply  a  second  deterministic 
technique  to  find  a  spcinning  tree  from  a  partition  tree  using  much  less  preprocessing  time. 

Theorem  3.3  For  a  configuration  of  n  points  P,  we  can  compute  a  non-self-intersecting  spanning 
tree  T  with  stabbing  number  0{n°-^^^)  by  a  procedure  that  uses  O(nlogn)  time  and  linear  space. 

Proof:  Partition  trees,  such  as  Willard  trees  [Wil82]  or  the  conjugation  trees  of  Edelsbrun- 
ner  and  VVelzl  [EW86],  recursively  divide  a  set  of  points  for  efficient  processing  of  halfplane 
range  queries.  Each  node  of  the  tree  is  associated  with  a  subset  of  the  points.  The  impor- 
tant property  of  partition  trees  is  that,  for  any  line  t,  at  most  0{n°)  nodes  contain  sets 
of  points  that  are  separated  by  £.  The  conjugation  tree,  for  example,  is  a  binary  tree  with 
Q  =  log2((l  -f  \/5)/2);  it  can  be  built  deterministically  in  0(n log n)  time. 

We  construct  a  graph  with  stabbing  number  0{n°)  from  a  conjugation  tree  by  computing 
the  convex  hull  of  the  set  of  points  at  each  node.  Using  a  divide-and-conquer  strategy,  we 
merge  the  convex  hulls  of  the  two  children  of  a  node  by  finding  two  outer  common  tangents. 
(In  a  conjugation  tree,  the  children  of  a  node  are  separable  by  a  line;  this  guarantees  that 
their  convex  hulls  have  exactly  two  common  tangents.)  Two  huUs  with  m  points  can  easily 
be  merged  in  0{m)  time,  which  gives  an  O(nlogn)  bound  for  the  whole  process. 

This  divide-and-conquer  procedure  terminates  with  a  plajiar  embedding  of  a  connected 
graph  that  has  at  most  two  edges  for  each  node  of  the  conjugation  tree.  Compute  a  spanning 
tree  T  on  this  graph.  Any  line  t  separates  points  in  at  most  0{n'')  nodes  of  the  conjugation 
tree,  and  thus  T  has  stabbing  number  0{n°)  -  0{n°-^^^).      I 

3.2     Randomized  algorithms 

To  obtain  either  fast  preprocessing  or  optimal  stabbing  number,  the  deterministic  algorithms  of 
the  preceding  subsection  sacrifice  the  other.  If  we  use  randomization,  we  can  in  0(n  log  n)  time 
compute  a  tree  T  with  an  average  stabbing  number  nearly  as  good  as  that  of  Theorem  3.2.  More 
precisely: 


Theorem  3.4  Given  a  configuration  of  n  points  P,  a  random  sample  with  size  (1  +  y/2)y/n  drawn 
from  a  set  of  lines  C,  and  a  constant  a  >  0,  we  can  construct  a  spanning  tree  T  on  P  such  that, 
with  probability  1  -  n~°'  for  any  a'  <  a,  the  total  number  of  intersections  between  C  and  T  is 
Ai\£\y/n),  where  the  constant  of  proportionality  also  includes  a  as  a  factor.  Thus,  the  stabbing 
number  of  the  lines  of  C  is  A{y/n)  on  the  average.  The  construction  takes  0{n\ogn)  time. 

Proof:  In  order  to  minimize  the  total  number  of  line-edge  intersections  as  we  build  our 
tree,  we  employ  random  sampling  techniques  due  to  Clarkson  [Cla87],  which  are  closely 
related  to  the  €-nets  used  by  Welzl  [Wel88,  HW87].  The  theory  of  c-nets  says  that,  in  a 
range  space  of  finite  Vapnik-Chervonenkis  dimension  d,  a  random  sample  of  size  r  drawn 
from  a  space  of  n  elements  is  an  €-net  with  probability  1  —  ^,  if  the  sample  size  r  is  at  least 
max((4/t)log(2/^),(8d/€)log(8d/f)).  In  our  case  the  ranges  will  be  triangles  or  lines  and 
our  6  =  n'".  Thus  we  can  compute  a  constant  c  such  that,  with  probability  1  —  ^,  the 
random  sample  is  an  e-net  for  e  =  (calogn)/r. 

Our  algorithm  constructs  the  tree  T  in  at  most  logn  phases.  It  uses  part  of  the  random 
sample  of  lines  from  C  in  each  phase;  the  unused  sample  lines  are  stored  in  a  list,  which 
initially  contains  the  whole  random  sample.  The  algorithm  maintains  a  current  set  of  points, 
which  it  initializes  to  P;  each  phase  discards  at  least  half  of  the  current  set.  If  a  phase  begins 
with  j  points — call  it  phase  j — we  do  the  following.  Obtain  a  random  sample  R  of  the  lines 
of  £  by  removing  the  first  •y/j72  lines  from  the  list.  Construct  the  arrangement  A{R) 
in  time  linear  in  j  [EOS86,  CGL85].  Triangulate  each  of  the  faces  of  this  arrangement  to 
form  a  subdivision  of  the  plane,  which  we  denote  T{R).  A  short  calculation  shows  that  the 
number  of  triangular  faces  in  T{R)  is  at  most  j/2  for  _;'  >  2.  Place  the  j  points  into  the 
triangles  that  contain  them  by  performing  a  point  location  on  each  point.  After  0{j)  initial 
preprocessing,  each  of  these  j  point  locations  takes  logarithmic  time  [EGS86,  Kir83,  ST86]. 
Finally,  within  each  triangle,  form  a  (non-self-intersecting)  spanning  tree  on  the  points  it 
contains,  choose  a  root  vertex,  and  discard  all  points  except  the  root.  This  leaves  at  most 
j/2  points.  (If  j  =  2,  we  do  not  bother  to  pick  R,  but  just  form  a  single-edge  spanning  tree 
on  the  two  points.) 

Each  phase  gets  an  independent  random  sample  from  the  lines  of  C,  but  for  convenience 
we  bundle  them  into  the  single  sample  specified  in  the  statement  of  the  lemma.  The  total 
size  of  the  random  sample  we  need  from  C,  summed  over  all  phases,  is  at  most  X^,>o  \/"/2'  = 
(1  +  V2)^. 

We  must  do  three  things:  prove  that  the  algorithm  gives  a  spanning  tree,  show  that 
this  tree  has  the  right  average  stabbing  number,  and  determine  the  time  complexity  of  the 
algorithm. 

We  stop  the  algorithm  when  j  =  1,  so  the  last  phase  finds  a  spanning  tree.  Phase  j  com- 
putes at  most  j/2  rooted  spanning  trees  and,  by  induction,  the  remainder  of  the  algorithm 
computes  a  spanning  tree  on  their  roots.  Thus,  the  graph  returned  is  connected.  Since  n  —  1 
edges  are  used  altogether,  the  result  is  a  spanning  tree. 

As  mentioned  above,  there  is  a  constant  cj  (which  here  includes  the  multiplicative  factor 
of  a)  such  that,  with  probability  1  -  n~°,  our  sample  R  is  an  c-net  of  size  f  =  Ci  log  n/v7 
for  triangular  ranges.  This  means  that  for  any  triangle  r  in  'r{R),  at  most  Ci|£|logn/v/J 
of  the  lines  of  the  set  C  intersect  r.  Therefore,  each  edge  we  add  contributes  at  most  this 
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number  of  line-edge  incidences  to  the  total.  Summing  over  all  log  n  phases,  the  total  number 
of  edge-line  incidences  is,  with  probability  (1  -  n"")'"*"  >  1  -  n-'^logn,  at  most 

E      ^iKI^     <    C:|£|logn   X:    4 

for  each  edge  VJ  l<»<n  ^» 

in  each  phase  j 

=    2ci|£|v/;?logn(l  +  0(l/v/^)). 

Thus,  the  average  stabbing  number  over  the  lines  of  £  is  A(v/n)- 

The  time  required  by  the  j  point  locations  at  O(logn)  time  apiece  dominates  the  time 
of  phase  j.  Since  the  j's  are  bounded  by  a  geometric  series,  the  total  time  complexity  is 
O(nlogn).       I 

Note  that  the  theorem  allows  the  random  sample  to  contain  duplicates.  In  fact,  the  theorem 
holds  even  when  |£|  is  smaller  than  (1  +  ■\/2)-^. 

To  compare  this  theorem  to  Theorem  3.2,  let  £  be  the  set  of  (j)  non-equivalent  lines  that 
separate  the  points  of  P.  We  can  obtain  a  random  line  of  £  by  choosing  a  random  pair  of  points 
and  rotating  slightly  the  line  that  they  define  (note  that  the  rotation  is  only  conceptual).  Thus 
the  average  stabbing  number  is  A{y/n)  and  we  can  be  assured  that  at  least  half  of  the  lines  have 
stabbing  number  at  most  twice  the  average. 

Theorem  3.4  is  useful  for  applications  in  which  the  average  stabbing  number  is  important,  but 
the  worst-case  stabbing  number  is  not.  The  calculation  of  many  faces  in  an  arrangement  of  lines, 
described  in  Section  8,  is  such  an  application. 

We  use  an  iterative  approach  to  obtain  a  good  worst-case  stabbing  number.  The  algorithm  of 
Theorem  3.4,  running  on  a  set  of  lines  £,  produces  a  tree  T  with  the  property  that  at  least  half  of 
the  lines  of  £  are  good:  they  have  stabbing  number  A('y/n).  K  we  could  identify  which  lines  were 
bad,  then  we  could  run  the  algorithm  again  on  these  lines.  Repeating  this  process  a  logarithmic 
number  of  times,  we  would  form  a  family  of  trees  such  that  every  line  has  stabbing  number  A{y/n) 
with  respect  to  one  of  the  trees.  To  identify  the  bad  lines  explicitly  seems  expensive;  however,  by 
one  more  application  of  random  sampling,  we  can  approximate  the  set  of  bad  lines  closely  enough 
to  establish  the  following  theorem. 

Theorem  3.5  Given  a  configuration  of  n  points  P  and  a  constant  a  >  0,  we  can  compute  a  family 
ofk  =  0{logn)  spanning  trees  Ti,T2,..  .,Tk  on  P  with  the  property  that,  with  probability  1  —  n'" 
for  any  a'  <  a,  for  every  line  I  there  is  a  tree  T,  such  that  I  has  stabbing  number  A{y/n)  with 
respect  to  Ti.  The  procedure  uses  A{n^'^)  time  and  linear  space  for  each  tree  in  the  family,  for  a 
total  of  A{n^l^)  time  and  0{n\ogn)  space.  Again,  a  enters  as  a  multiplicative  factor  into  the  time 
complexity  bound. 

Proof:  To  aid  our  presentation,  we  number  the  bad  sets  and  the  spanning  trees  according 
to  the  order  in  which  they  are  produced.  Tree  T,  is  built  based  on  the  bad  set  £,-,  and  bad 
set  £,+1  is  defined  by  £,  and  T,.  The  first  set,  £i  =  £,  is  the  set  of  Q)  non-equivalent  lines 
that  separate  P  in  all  possible  ways.  The  first  spanning  tree,  Ti,  is  good  for  at  least  |£i|/2 
lines.  We  would  like  £2  to  include  the  bad  lines  and  none  of  the  good  lines;  more  generally, 
we  would  like  £,+1  to  include  the  bad  lines  of  £,  and  none  of  the  good  lines.  However,  since 
we  do  not  allow  ourselves  enough  time  to  count  all  intersections  between  £,  and  TJ,  this 
seems  to  be  asking  too  much. 


We  resolve  the  problem  by  letting  the  set  £,  approximate  the  set  of  bad  lines.  There  are 
two  challenges  to  be  met:  First,  we  cannot  let  bad  lines  slip  out  of  the  approximation,  yet 
we  need  to  reduce  the  size  of  £,  by  a  constant  factor  at  each  step.  Second,  we  cannot  store 
£,  explicitly — it  is  too  large.  We  need  to  store  it  in  an  implicit  form  that  still  allows  us  to 
choose  a  random  sample  of  (1  +  y/2)y/n  lines  quickly.  It  is  the  cost  of  sampling  from  £,  that 
dominates  the  cost  of  constructing  the  spanning  trees:  by  Theorem  3.4,  each  spanning  tree 
construction  takes  0(n  log  n)  time,  given  a  random  sample  from  d  of  size  0{y/n),  but,  as 
Theorem  3.8  below  shows,  finding  the  random  sample  takes  A(n3/2)  tj^g 

Our  method  for  computing  the  set  £,  uses  the  fact  that  a  random  sample  of  the  edges 
of  T,  is  a  good  estimator  for  the  number  of  intersections  of  a  line  with  Tj:  if  we  take  an 
appropriately-sized  random  sample  of  tree  edges,  we  expect  bad  lines  to  reveal  themselves 
by  intersecting  the  sample. 

We  pick  a  random  sample  R  ol  t  =  C2>/n/logn  tree  edges,  where  the  constant  C2 
will  be  chosen  so  that  at  most  |£,|/2  lines  intersect  the  sample.  The  lines  of  d  that 
intersect  an  edge  of  the  sample  will  constitute  the  new  bad  set  approximation,  £i+i.  In 
the  next  two  paragraphs,  we  will  show  how  to  choose  the  constant  cj  to  obtain  a  sample 
with  at  most  |£i|/2  points  of  intersection  between  £,  and  R — this  will  certainly  ensure  that 
l^i+il  <  1-^.1/2.  Then  we  will  show  that  the  lines  that  that  do  not  intersect  an  edge  of  R 
stab  few  edges  of  T^. 

How  do  we  bound  the  number  of  intersections  between  d  and  our  random  sample  R1 
The  construction  of  Ti  gives  a  bound  on  the  number  of  intersections  between  each  edge  of  the 
tree  and  the  lines  of  £,.  In  fact,  the  construction  of  T,  gives  a  bound  on  the  number  of  lines 
passing  through  the  triangle  containing  edge  e  when  e  is  added  to  the  tree;  the  algorithm  can 
store  that  bound  as  B{e).  Since  the  average  bound  of  the  edges  of  T,  is  2ci|£i|  log  n/y/n,  the 
expected  number  of  intersections  between  £,  and  R  (the  random  sample  of  r  =  c^^/n/ logn 
edges  of  T.)  is  bounded  by  2ciC2|£,|.  Choose  C2  so  that  2ciC2  =  1/4.  Then  at  least  half  of  all 
random  samples  have  at  most  |£,i/2  intersections  with  £,,  according  to  the  bounds  B{e). 

For  a  particular  sample  R,  we  sum  the  intersection  bounds  of  its  edges,  JZeefl  ^(*)i  ^^ 
order  to  bound  the  number  of  intersections  between  R  and  £j.  If  the  bound  is  greater  than 
\Ci\/2,  we  pick  another  sample.  Since  summing  the  bounds  takes  only  0{y/n)  time,  in  linear 
time  we  can  choose  up  to  ^/n  samples  if  necessary.  Thus,  with  probability  1  —  2"^",  we 
can  guarantee  that  the  sum  of  the  bounds  is  at  most  |£i|/2.  If  the  construction  of  tree  Ti 
succeeded,  then  the  bounds  are  accurate,  and  |£,+i|  <  |£i|/2. 

We  use  c-nets  to  bound  the  stabbing  number  of  lines  of  £<  —  £,+i.  The  elements  of 
the  range  space  are  tree  edges  and  the  ranges  are  lines.  We  again  choose  6  to  be  n~°;  this 
implies  that  e  =  0(log  n/r).  In  this  application  of  random  sampling,  we  reject  as  many 
as  half  of  the  samples,  so  the  proportion  of  times  the  sample  fails  to  be  an  f-net  may  as 
much  as  double.  But  we  can  still  compute  a  constant  c'  such  that  with  probability  at  least 
1  -  271"°,  any  line  that  cuts  at  least  c'y/nlog^  n  edges  of  T,  also  cuts  a  line  of  the  random 
sample.  Therefore,  each  line  deleted  from  £,•  has  stabbing  number  A{y/n)  with  respect  to 
tree  T,. 

What  is  the  probability  that  aU  the  bounds  on  stabbing  numbers  are  correct,  and  hence 
that  the  algorithm  succeeds?  With  probability  l-2~^,  we  can  find  a  random  sample  R  that 
has  a  low  intersection  bound  with  Ci.   Independently,  each  sample  has  the  c-net  property 
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with  probability  1  -  2n  ",  and  each  spanning  tree  T,  has  the  right  stabbing  bounds  with 
probability  (1  -  n~"y°8".  Thus  the  overall  probability  of  success  is  at  least 

(1  -  2n-'*  -  2"^)^'°8"(1  -  n"°)2^«^" 

>   l-(4logn  +  21og^n  +  l)n-°'     >     1  -  n""' 

for  any  q'  <  a  as  n  goes  to  infinity. 

Let  us  summarize  our  algorithm.  In  a  sequence  of  k  =  O(log  n)  stages,  the  algorithm 
produces  k  spanning  trees  for  P.  Stage  »  draws  a  random  sample  of  size  (1  +  y/2)y/n  from 
the  approximate  bad  set  £,.  We  will  see  in  Theorem  3.8  that  drawing  the  sample  from 
£,  takes  0(n"'/^  log  n)  time  and  lineaj  space  for  all  stages  after  the  first  (sampling  from 
C\  =  C  takes  0{y/n)  time).  The  algorithm  uses  the  random  sample  from  Ci  to  compute  T, 
in  O(nlogn)  time  by  the  method  of  Theorem  3.4.  It  then  draws  a  random  sample  of  the 
edges  of  Ti  and  defines  £,+1  implicitly  as  the  set  of  lines  in  £,  that  intersect  these  edges. 
To  bound  the  time  and  space  required  by  the  algorithm,  we  impose  a  niixed  termination 
condition:  the  aJgorithm  terminates  when  £,+1  is  empty  or  after  21ogTi  stages,  whichever 
comes  first.  With  high  probability,  £,+1  contains  at  most  half  the  lines  of  d  at  each  stage, 
so  the  algorithm  almost  always  terminates  with  £,+1  empty.  Let  k  be  the  number  of  trees 
built  by  the  algorithm.  The  space  needed  to  store  the  trees  is  0{kn)  =  0{n\ogn),  and 
the  time  used  to  build  them  is  O(n^'^log^  n).  The  properties  of  random  samples  ensure 
that,  with  probability  at  least  1  —  n~°  for  any  a'  <  a,  every  line  (  has  stabbing  number 
0{y/n\og^  n)  with  respect  to  at  least  one  of  Ti,T2,...  ,Tk.      I 

3.3     Random  sampling  from  a  bad  set 

Theorem  3.5  defines  the  approximate  bad  set  £,+1  as  the  set  of  all  lines  in  £,  that  intersect  a 
random  Scimple  of  the  edges  of  T,.  This  subsection  shows  how  to  sample  from  £,+1  without  looking 
at  all  of  its  lines.  We  begin  by  considering  the  definition  of  the  lines  in  £1  =  £. 

Each  line  ^  of  £  is  determined  by  two  points  of  P:  we  draw  (  through  the  two  points  tind  rotate 
it  slightly  clockwise,  so  that  it  passes  above  the  left  point  and  below  the  right  one.  But  recall  that 
the  points  of  P  are  really  the  duals  of  the  original  n  lines  of  L.  In  primal  space,  then,  t  corresponds 
to  an  intersection  of  two  original  lines,  perturbed  slightly  into  one  of  the  four  quadrants  defined 
by  the  lines  (the  quadrant  above  the  line  whose  dual  is  the  left  point,  and  below  the  line  whose 
dual  is  the  right  point).  We  refer  to  the  primal  version  of  ^  as  the  canonical  perturbation  of  its 
corresponding  intersection.  Line  t  cuts  a  (dual)  segment  if  ^'s  corresponding  primal  point  lies  in 
a  double  wedge  corresponding  to  the  segment.  (The  bounding  lines  of  the  double  wedge  ase  the 
primal  versions  of  the  segment  endpoints.)  Define  W,  to  be  the  (primal)  arrangement  of  the  double 
wedges  that  correspond  to  all  the  randomly  sampled  edges  from  Ti,...,Ti-\.  Each  face  of  Wi 
dualizes  to  the  set  of  all  lines  that  cut  a  particular  subset  of  the  sample  edges.  A  line  is  in  £,■  if 
and  only  if  it  cuts  a  sampled  edge  from  each  tree  Ti, . . .  ,T,_i;  equivalently,  a  line  belongs  to  d  if 
and  only  if  its  primal  version  falls  in  one  of  a  certain  set  of  faces  of  Wi.  We  think  of  these  faces 
as  being  colored.  Translating  the  definition  of  £i+i  in  Theorem  3.5  to  primal  terms,  we  get  the 
colored  faces  of  Wi+i  by  intersecting  the  colored  faces  of  Wi  with  the  set  of  double  wedges  that 
correspond  to  the  random  sample  from  T..  Since  Wi  has  complexity  0(ni'^/  log'  n)  =  0(n),  we  can 
build  it  and  determine  its  colored  faces  in  linear  time  [CGL85,  EOS86]. 
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The  arrangement  W,  defines  £,  implicitly.  To  build  T,  based  on  £,,  we  need  to  draw  random 
samples  from  Ci-  In  primal  terms,  we  want  to  pick  a  random  sample  from  the  intersections  of  the 
lines  of  L  (the  original  n  lines)  that  fall  in  colored  faces  of  W,-,  or  whose  canonical  perturbations 
do  so.  We  refer  to  these  intersections,  which  fall  in  colored  faces,  a£  being  colored  themselves.  To 
enumerate  all  the  intersections  of  primal  lines  explicitly  would  take  0{-n})  time;  we  enumerate  them 
implicitly  using  A(n^/^)  time  and  0{n)  space.  Our  implicit  enumeration  is  based  on  the  fact  that 
there  are  only  0(n^/^i/ log  n)  intersections  between  the  n  ori^nal  lines  of  L  aaid  the  edges  of  W,. 
We  compute  these  intersections,  then  use  them  to  count  and  sample  from  the  colored  intersections. 

The  idea  of  the  sampling  procedure  is  the  following:  We  count  the  colored  intersections,  im- 
plicitly assigning  a  unique  integer  index  to  each.  We  then  pick  a  random  sample  from  the  integers 
between  1  and  the  number  of  colored  intersections.  Our  counting  algorithm  lets  us  explicitly  pro- 
duce the  intersection  corresponding  to  any  given  index,  so  we  run  the  algorithm  again  to  produce 
the  intersections  that  correspond  to  the  randomly  chosen  integers.  These  intersections  constitute 
the  desired  random  sample. 

The  algorithm  for  counting  and  computing  intersections  has  two  distinct  parts,  one  dealing  with 
a  single  face  of  VT,  and  one  dealing  with  all  the  faces.  The  single-face  algorithm  takes  as  input  the 
set  of  all  lines  of  L  that  intersect  the  boundary  of  a  given  colored  face.  It  counts  the  intersections 
of  lines  of  L  that  fall  inside  the  face  and  implicitly  assigns  indices  to  them.  Given  a  sorted  set  of 
indices,  it  can  produce  the  intersections  corresponding  to  them.  The  global  part  of  the  algorithm 
provides  the  input  to  the  single-face  algorithm.  It  enumerates  the  colored  faces  of  VT,  and  computes 
the  intersections  of  L  with  the  boundary  of  each  face.  Both  parts  of  the  algorithm  use  C>(logn) 
time  per  intersection  between  L  and  VV,  and  linear  space  altogether.  The  following  lemma  describes 
the  single-face  algorithm. 

Lemma  3.6  Let  f  he  a  colored  face  of  VF,,  and  let  H  be  the  set  of  lines  of  L  that  intersect  the 
boundary  of  f.  Define  h  =  \n\.  If  we  are  given  H  as  input,  then  in  O(h\ogh)  time  and  0(h) 
space  we  can  count  the  intersections  of  L  inside  f,  including  those  on  the  boundary  whose  canonical 
perturbations  lie  in  f.  Let  the  number  of  these  intersections  be  I.  The  counting  algorithm  implicitly 
assigns  a  unique  integer  in  the  range  1  to  I  to  each  intersection;  given  a  sorted  list  of  integers 
between  1  and  I,  the  algorithm  can  produce  the  corresponding  intersections  in  O(log  n)  additional 
time  apiece  during  the  counting  process. 

Proof:  We  restrict  our  attention  to  the  lines  in  H,  since  any  line  of  L  that  intersects  the 
interior  of  face  /  must  cross  the  boundary  of  /,  and  therefore  must  belong  to  H .  Our 
algorithm  has  two  steps,  one  to  count  intersections  on  the  boundary  of  /  and  one  to  count 
intersections  inside  /. 

The  first  step  of  the  algorithm  sorts  the  intersections  of  H  with  the  boundary  of  the  face 
/  in  order  around  /.  This  takes  0{h\ogh)  time.  (Because  U  includes  the  supporting  lines 
of  /,  our  cinalysis  does  not  need  to  explicitly  include  the  complexity  of  the  boundary  of  /.) 
From  the  sorted  list  the  algorithm  determines  the  0(h)  intersections  of  lines  of  L  that  fall 
on  the  boundary  of  /.  It  counts  explicitly  the  ones  whose  canonical  perturbations  fall  inside 
the  face  /. 

The  second  step  of  the  algorithm  counts  the  intersections  of  lines  in  the  interior  of  /. 
It  considers  only  the  transverse  intersections  between  the  lines  of  H  and  the  boundary  of 
/,  those  whose  lines  cut  the  interior  of  /.  It  numbers  these  intersections  in  order  around  / 
from  1  to  h',  for  some  h'  <  2h.  The  intersections  are  paired:  each  intersection  stores  the 
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index  of  the  other  intersection  of  its  line  with  the  boundary  of  /.  The  algorithm  to  count  the 
intersections  inside  the  face  is  similar  to  counting  inversions  in  a  permutation  [Knu73].  The 
algorithm  maintains  a  list  S  of  active  intersections,  which  it  initializes  to  empty,  and  a  count 
count  of  intersections  noted  so  far.  The  counting  algorithm  scans  through  the  intersection 
indices  in  order  from  1  to  h'.  At  each  step,  let  a  be  the  current  index.  The  line  that  intersects 
the  face  /  at  a  cuts  it  a  second  time,  at  a  point  with  index  b.  If  o  <  6,  then  neither  is  in  5; 
we  append  a  to  5.  On  the  other  hand,  if  a  >  6,  then  b  already  belongs  to  5;  we  locate  6  in 
S,  set  c  to  be  the  number  of  elements  of  S  after  6,  add  c  to  count,  and  delete  b  from  S.  The 
same  technique  has  been  used  in  [GOS87]  to  count  intersections  between  line  segments. 

The  correctness  of  the  second  step  of  the  counting  algorithm  rests  on  the  observation 
that  the  quantity  c  is  the  number  of  lines  that  intersect  segment  ab  and  have  not  yet  had 
their  intersections  counted.  The  list  5  is  a  sorted  list  of  integers  that  supports  the  operations 
append,  find,  find-by-index  (given  j,  return  the  j^^  element  of  5),  delete,  and  length-of-tail 
(return  the  number  of  elements  after  the  given  one).  A  balanced  binau-y  tree  structure 
provides  these  operations  in  0(\ogh)  time  apiece.  The  algorithm  performs  0(h)  operations, 
each  of  which  takes  0{\ogh)  time;  this  gives  an  0{h\ogh)  overall  time  bound.  The  space 
requirement  is  0(h). 

Given  a  sorted  list  of  indices,  the  algorithm  can  produce  the  corresponding  intersections 
in  order  as  it  runs.  The  intersections  on  the  boundary  of  /  are  trivial  to  produce.  Consider 
an  index  x  belonging  to  an  intersection  in  the  interior  of  /.  The  algorithm  runs  as  described 
above  until  count  <  x  <  count  +  c.  Then  the  desired  intersection  is  the  one  involving  ab 
and  the  line  whose  intersection  with  the  face  boundary  is  distance  (x  —  count)  after  b  in  S. 
The  intersection  is  computed  using  find-by-index.      | 

We  have  seen  how  to  count  the  intersections  of  the  n  original  lines  of  L  that  fall  inside  a  face  of 
Wi,  given  the  intersections  of  L  with  the  boundary  of  the  face.  We  must  now  show  how  to  enumerate 
the  colored  fax:es  of  Wi  in  some  order,  and  for  each  face  find  all  intersections  of  L  with  its  boundary. 
Our  goal  is  to  do  the  enumeration  in  0(71)  space  and  in  O(logn)  time  per  intersection  between  L 
and  Wi.  We  achieve  these  bounds  by  enumerating  the  faces  along  each  line  of  Wi  successively  (each 
double  wedge  contributes  two  lines  to  Wi). 

In  outline,  our  method  is  the  following:  We  associate  each  colored  face  of  Wi  with  the  unique 
supporting  line  below  it  of  minimum  slope.  Our  duality  transform  ensures  that  this  line  exists, 
since  no  double  wedge  can  contain  the  bottom  face  of  Wi  (cf.  the  proof  of  Theorem  3.2).  Next  we 
step  through  the  lines  of  Wi  in  some  arbitrary  order.  For  each  line  i  6  Wi,  we  find  the  intersections 
of  L  with  the  faces  associated  with  I,  then  count  the  intersections  in  each  such  face.  We  process 
one  face  at  a  time,  taking  them  in  left-to-right  order.  The  algorithm  uses  the  following  ordering 
lemma: 

Lemma  3.7  Let  fi,/^,---  be  the  faces  of  an  arrangement  that  are  incident  to  a  line  I  of  the 
arrangement  and  lie  on  one  side  of  i  (say  above  £).  Then  any  other  line  £'  that  intersects  some  of 
/i,  /2, . . .  intersects  them  in  a  left-to-right  order  consistent  with  their  order  along  I,  or  the  reverse 
of  that  order. 

Proof:  Suppose  to  the  contrary  that  faces  a,b,c  appear  in  left-to-right  order  on  t  but  in 
order  a,  c,  b  on  I'.  (AU  possible  mis-orderings  are  equivalent  to  this  one.)  Consider  the  line  4 
of  the  arrangement  whose  intersection  with  i  creates  the  left  vertex  of  c  on  I.  See  Figure  2. 
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Now  a  and  b  lie  on  the  opposite  side  of  4  from  c,  as  do  all  predecessors  of  c  along  £.  For  t 
to  intersect  a,  c,  and  b  in  that  order,  it  must  cross  4  twice,  which  is  impossible.       | 


Figure  2:  Any  line  intersects  faces  a,  b,  and  c  in  order 

This  lemma  lets  us  produce  the  intersections  of  a  line  /'  with  the  faces  associated  with  £  in  left- 
to- right  order  (with  respect  to  £).  We  can  use  ray-shooting  [CG85]  or  fractional  cascading  [CG86] 
to  compute  which  of  these  faces  £'  hits  in  order  along  £';  by  the  lemma,  this  order  is  consistent  with 
the  order  of  the  faces  along  £.  We  use  a  next-face  function  that,  given  £'  and  its  current  fax:e,  in 
0{logn)  time  computes  the  next  face  associated  with  £  that  £'  hits. 

Given  a  line  £  of  Wi,  our  method  begins  by  scanning  through  the  colored  faces  incident  to  /  and 
determining  which  of  them  are  associated  with  /;  we  speak  of  these  faces  as  being  flagged  in  the 
remainder  of  this  paragraph.  The  scan  takes  0{iy/n/ \ogn)  time  [CGL85,  EOS86].  Each  flagged 
face  has  a  bucket  that  holds  lines  of  X;  each  line  of  L  goes  in  at  most  one  bucket.  We  maintain 
the  invariant  that  when  a  flagged  face  is  processed,  its  bucket  holds  all  the  lines  of  L  that  intersect 
the  face.  We  initialize  the  buckets  by  using  next-face  to  place  each  line  of  L  in  the  bucket  for 
the  leftmost  flagged  face  (with  respect  to  £)  that  it  intersects.  We  next  scan  through  the  flagged 
faces  along  £  from  left  to  right.  When  we  reach  face  /,  we  count  the  intersections  of  L  inside  /  as 
described  in  Lemma  3.6.  When  we  finish  with  face  /,  we  apply  next-face  to  each  line  £'  of  L  that 
hits  /,  obtaining  the  next  intersection  of  £'  with  a  flagged  face.  We  transfer  £'  from  the  bucket  for 
/  into  the  bucket  for  the  next  intersected  face.  This  guarantees  that  when  each  face  is  processed, 
its  bucket  contains  all  the  lines  of  L  that  intersect  it,  as  required. 

This  method  lets  us  count  the  intersections  of  I  that  fall  in  colored  faces  of  W,.  How  do  we 
use  this  to  take  a  random  sample  from  the  colored  intersections?  Once  we  know  how  many  colored 
intersections  there  are  altogether,  we  generate  a  random  sample  of  (1  +  y/2)y/n  integers  between 
1  and  the  number  of  colored  intersections,  then  run  the  algorithm  again,  this  time  using  it  to 
produce  the  intersections  indexed  by  the  random  numbers.  This  produces  a  random  sample  of  the 
colored  intersections. 

What  is  the  time  and  space  complexity  of  this  algorithm?  For  each  line  of  W,,  the  algorithm  uses 
O(nlogn)  time  to  initialize  the  buckets.  It  takes  O(logn)  time  to  find  each  face-line  intersection. 
The  face  processing  costs  also  amount  to  O(logn)  per  intersection.  Therefore,  since  there  are 
0(n^/^i71ogn)  intersections  between  L  and  V^i,  selecting  a  random  sample  from  £,  takes  Oin^'^i) 
time,  or  a  total  of  0{n^l'^\o^  n)  for  all  1  <  t  <  21ogn.  The  space  complexity  is  linear:  0{n)  for 
the  arrangement  V^,,  plus  an  additional  0{n)  for  the  buckets. 

We  have  established  the  following  theorem,  which  was  required  in  the  proof  of  Theorem  3.5: 
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Theorem  3.8  Given  the  set  of  spanning  tree  edges  from  trees  T\,...  ,r,_i  that  determines  the  set 
of  lines  Ci,  we  can  define  the  arrangement  W,  of  the  double  wedges  whose  duals  are  these  tree  edges. 
We  define  some  faces  ofWi  to  be  colored  in  accordance  with  the  definition  of  d:  a  line  of  C  belongs 
to  d  if  and  only  if  it  is  the  dual  of  a  point  lying  in  a  colored  face  ofWi.  We  can  build  Wi  in  0{n) 
time  and  space.  Using  Wi,  we  can  take  a  random  sample  of  size  r  from  Ci  in  0{{n^l'^  +  r)logn) 
time  and  0(n  +  r)  space. 

3.4     Untangling  a  spanning  tree 

The  aJgorithms  of  Theorems  3.2  and  3.4  can  generate  trees  that  have  self-intersections — the  trees 
may  be  tangled.  In  order  to  use  these  trees  in  our  application,  we  must  untangle  them  without 
increasing  their  stabbing  numbers  by  too  much.  We  give  an  algorithm  that  takes  the  output  of 
the  algorithm  of  Theorem  3.4  and  untangles  it — returning  a  tree  without  self-intersections — in 
O(nlogn)  time.  The  algorithm  can  be  adapted  to  untangle  a  tree  from  Theorem  3.2  in  A(n^/^) 
time.  The  algorithm  has  two  parts.  First,  we  form  a  tree  with  no  self-intersections  that  has  at 
most  n  additional  vertices,  or  Steiner  points.  Second,  we  eliminate  the  Steiner  points  by  employing 
a  result  of  Guibas  and  Hershberger  [GH87]  to  find  shortest  paths  from  vertex  to  vertex  using  the 
tree  as  a  simple  polygonal  obstacle.  The  first  paxt  does  not  increase  the  stabbing  number  of  any 
line;  the  second  at  worst  doubles  the  stabbing  number  of  any  line. 

At  times  during  the  zdgorithm,  we  want  to  think  of  an  untangled  tree  as  a  simple  polygon.  We 
can  do  so  by  tracing  am  Eulerian  tour  around  the  tree  and  treating  it  as  a  simple  polygon  with  twice 
as  many  edges.  As  a  result,  many  basic  algorithms  designed  for  simple  polygons  apply  to  our  trees 
as  well.  For  example,  Chazelle  and  Guibas  [CG85]  give  an  algorithm  that  can  be  adapted  to  our 
structure  so  that,  with  O(nlogn)  preprocessing  and  0(n)  space,  we  cam  answer  "shooting  queries" 
in  O(logn)  time.  A  shooting  query  has  the  following  form:  given  a  ray  from  a  point  p,  report  the 
intersection  point  closest  to  p  of  the  ray  with  an  untangled  tree.  This  will  be  an  important  tool  in 
the  proof  of  Theorem  3.9. 

Theorem  3.9  Suppose  we  are  given  the  edges  of  a  tangled  spanning  tree  on  n  points  that  are  output 
by  the  algorithm  of  Theorem  3.4-  Then  we  can  construct  an  untangled  tree  with  at  most  n  additional 
(Steiner)  points  by  using  C>(n  log  n)  time  and  linear  space.  For  any  line,  the  stabbing  number  with 
respect  to  the  untangled  tree  will  be  at  most  the  stabbing  number  with  respect  to  the  tangled  tree. 

Proof:  Recall  that  Theorem  3.4  constructs  a  spanning  tree  in  phases,  so  that  the  new  edges 
generated  within  each  phase  do  not  intersect  one  another.  We  build  the  desired  Steiner 
tree  (a  non-self-intersecting  tree  with  at  most  n  additional  vertices,  or  Steiner  points)  in  an 
incremental  manner,  adding  the  edges  one  at  a  time  in  a  way  that  makes  the  union  of  all 
added  edges  connected  at  all  times.  That  is,  we  now  add  edges  in  the  order  opposite  to  the 
one  in  which  the  edges  were  found.  We  say  that  the  untangling  algorithm  runs  in  stages  to 
distinguish  them  from  the  phases  of  the  tree  construction  algorithm.  We  call  the  points  of 
the  configuration  P  real  points  or  real  vertices. 

In  the  beginning  of  stage  j,  we  assume  that  we  have  a  Steiner  tree  with  at  most  j  (real 
and  Steiner)  vertices  and  without  self- intersections.  We  can  preprocess  the  Steiner  tree  for 
shooting  queries  as  mentioned  above.  We  also  have  several  trees  made  up  of  the  edges  that 
were  found  in  phase  j  of  the  tree  construction  algorithm — since  each  one  is  rooted  on  the 
Steiner  tree,  we  call  them  rooted  trees.   Taking  the  Steiner  tree  along  with  all  the  rooted 
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trees  gives  a  spanning  tree  of  all  j  points.  Furthermore,  no  two  edges  from  rooted  trees 
intersect. 

Within  each  rooted  tree,  number  the  vertices  by  a  preorder  traversal  from  the  root;  a 
child  receives  a  higher  number  than  its  parent.  Order  the  edges  by  labeling  them  with  the 
greater  of  their  two  endpoint  numbers.  If  we  respect  this  order  as  we  add  edges  to  the 
Steiner  tree,  then  the  path  from  a  vertex  p  to  the  root  is  added  before  any  edge  into  p  from 
one  of  its  children. 

Suppose  we  want  to  add  the  edge  (c,p),  where  vertex  c  has  a  larger  number  than  vertex  p. 
Shoot  from  the  child  c  toward  the  parent  p;  the  shot  can  hit  either  p  or  an  edge  of  the  Steiner 
tree.  In  the  former  case,  we  add  edge  ic,p)  to  the  tree.  In  the  latter,  we  introduce  a  Steiner 
point  3  at  the  first  intersection  point  from  c  towards  p,  and  add  edge  (c,s).  Point  a  splits  a 
Steiner  tree  edge  into  two  edges. 

This  process  maintains  connectivity  and  clearly  introduces  no  cycles,  since  it  always  adds 
as  many  vertices  as  edges. 

At  the  end  of  stage  j,  we  have  performed  shooting  from  a  total  of  j— 1  vertices  and  so  have 
introduced  at  most  j  -  1  Steiner  points.  We  have  a  Steiner  tree  with  no  self-intersections 
and  less  than  2j  vertices.  The  number  of  vertices  will  at  least  double  at  the  next  stage. 
Thus,  we  satisfy  the  initial  assumption  of  the  next  stage  and  can  proceed  to  it. 

After  stage  n,  we  have  a  Steiner  tree  with  fewer  than  n  Steiner  vertices.  Since  the 
shooting  process  merely  shortens  the  original  edges,  the  stabbing  number  of  a  line  with 
respect  to  the  tree  cannot  increase.  The  time  taken  is  bounded  by  a  geometric  sum: 

oi    Yl    Jlogj)  =C»((n  +  2  +  ---+l)logn)  =  0(nlogn). 

\  for  each  / 

■tage  j 

This  completes  the  algorithm.       | 

We  can  use  the  same  approach  to  build  a  Steiner  tree  from  the  output  of  Theorem  3.2,  or,  in 
fact,  from  any  tangled  tree.  We  traverse  the  tangled  tree  in  preorder.  When  we  visit  a  vertex  v, 
we  shoot  along  the  edge  from  v  toward  its  parent,  then  add  to  the  Steiner  tree  the  edge  from  v  to 
the  first  intersection  of  the  shot  with  the  current  Steiner  tree.  Determining  the  first  intersection 
is  slow,  in  generzd,  but  in  the  special  case  of  a  tree  with  low  stabbing  number,  we  can  determine 
all  intersections  of  tree  edges,  and  hence  the  ones  we  need,  in  A{ns)  time,  where  s  is  the  stabbing 
number  of  the  tree  [B079,  CE88]. 

In  our  application,  additional  Steiner  vertices  in  the  tree  correspond  to  additional  lines  in  the 
primal  arrangement.  These  lines  could  hide  true  edges  or  introduce  dummy  edges  in  the  faces  of 
the  arrangement  and  thus  affect  the  answers  to  queries.  To  remove  the  Steiner  points,  we  use  the 
shortest  path  algorithm  of  Guibas  and  Hershberger  [GH87]  as  follows:  Consider  an  Eulerian  tour 
r  around  the  outside  of  the  tree;  it  visits  both  Steiner  points  and  real  vertices  and  has  stabbing 
number  at  most  double  that  of  the  Steiner  tree.  For  any  pair  of  real  vertices  p  and  q  such  that 
the  portion  of  the  tour  r  between  them  visits  only  Steiner  points,  we  replace  this  portion  by  the 
shortest  path  from  p  to  7  that  does  not  intersect  the  Steiner  tree.  We  show  that  the  resulting 
straight-edge  multigraph  G  is  connected,  has  only  real  points  for  vertices,  and  has  at  most  0{n) 
edges.  The  edges  of  G  do  not  cross  each  other.  Moreover,  the  stabbing  number  of  G  with  respect 
to  any  line  is  at  most  that  of  r. 
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Two  real  vertices  p  and  q  are  said  to  be  adjacent  on  the  tour  if  the  tour  contains  a  path  from  p 
to  q  that  uses  only  Steiner  points.  We  call  the  path  a  Steiner  path  and  denote  it  by  Stp{p,q).  Let 
sp{p,  q)  denote  the  Euclidean  shortest  path  from  p  to  g  that  avoids  the  tree.  In  our  algorithm,  we  will 
replace  the  Steiner  path  Stp{p,q)  between  adjacent  vertices  p  and  q  by  the  shortest  path  sp{p,q). 
This  may  turn  our  tour  into  a  graph  with  higher  degree  vertices,  but  it  maintains  connectivity 
and,  as  we  show,  it  does  not  increase  the  stabbing  number  of  any  line.  We  have  one  important 
observation  about  Steiner  points. 

Lemma  3.10    With  respect  to  the  Eulerian  tour,  all  Steiner  points  are  convex  vertices. 

Proof:  Steiner  points  are  introduced  along  an  edge,  forming  one  angle  of  180°  and  two 
angles  that  sum  to  180°.       I 

We  call  a  simple  polygon  a  spiral  from  p  to  9  if  the  two  polygonal  chains  from  p  to  g  both  turn 
only  to  the  left  or  only  to  the  right. 

Lemma  3.11  Between  adjacent  real  vertices  p  and  q,  the  Steiner  path  and  shortest  path,  Stp{p,q) 
and  sp{p,q),  form  a  spiral  from  p  to  q.  This  spiral  contains  no  vertices  of  the  Steiner  tree  in  its 
interior. 

Proof:  By  construction,  the  Steiner  path  goes  along  the  tree  and  the  shortest  path  never 
touches  the  Steiner  path,  as  shown  in  Figure  3.  Thus,  concatenating  the  Steiner  and  shortest 
paths  gives  a  simple  polygon,  5. 


Figure  3:  The  Steiner  and  shortest  paths  form  a  spiral 

If  we  take  the  convex  hull  of  our  tree,  <dl  the  huU  vertices  are  real  vertices.  Thus  we  will 
never  have  a  Steiner  path  that  leaves  one  bay  or  pocket  of  the  hull  and  goes  to  another. 
The  shortest  path  sp{p,q)  lies  in  the  same  bay  as  Stp(p,q).  Because  the  polygon  S  does  not 
cross  the  Steiner  tree,  it  must  contain  either  all  or  none  of  it;  because  S  lies  in  a  single  bay 
of  the  convex  hull,  it  therefore  contains  no  vertex  of  the  Steiner  tree. 

Lemma  3.10  implies  that  there  axe  no  reflex  vertices  on  the  Steiner  path,  so  it  turns  in 
only  one  direction.  K  the  path  sp{p,q)  made  a  convex  turn  with  respect  to  the  inside  of  5, 
then  the  path  could  be  shortened.  Thus  the  shortest  path  turns  in  only  one  direction — the 
same  direction  as  the  Steiner  path.       | 

Since  the  shortest  paths  have  reflex  angles  with  respect  to  the  tour,  they  use  no  Steiner  points. 
Thus  the  graph  of  all  shortest  paths  between  adjacent  real  vertices  uses  only  real  vertices. 

We  can  now  prove  that  the  stabbing  number  of  the  shortest  path  sp{p,  q)  is  less  than  or  equal 
to  that  of  the  Steiner  path  Stp{p,q).  In  a  spiral,  the  chain  with  reflex  angles  on  the  inside  of  the 
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spiral  is  called  the  inner  chain,  and  the  other  chain  is  the  outer  chain.  For  example,  shortest  path 
sp{p,q)  is  the  inner  chain  of  the  spiral  from  p  to  q.  Up  ajid  q  are  both  on  the  convex  hull  of  the 
spiral  from  p  to  q,  then  we  call  it  a  short  spiral. 

Lemma  3.12  Let  S  be  a  spiral  from  p  to  q.  Any  line  intersects  the  outer  chain  at  least  as  often 
as  the  inner  chain. 

Proof:  If  the  spiral  is  short,  then  the  conclusion  holds.  Thus  we  will  cut  a  long  spiral  into 
several  short  spirals  by  the  following  process.  Consider  walking  along  the  inner  chain  from 
p  to  q,  maintaining  a  point  w  on  the  inner  chain  and  a  ray  that  is  tangent  to  the  inner  chain 
at  w  and  opposite  the  direction  of  the  walk.  As  we  walk  on  the  inner  chain  from  p,  this  ray 
sweeps  a  point  w'  on  the  outer  chain. 


Figure  4:  Cutting  a  long  spiral  along  a  ray 

At  the  beginning  of  the  walk,  choose  a  direction  d — perhaps  the  direction  of  the  first 
edge  on  the  walk.  Whenever  our  walk  faces  direction  d  or  —d,  we  cut  the  spiral  along  the 
edge  {w,w').  See  Figure  4.  Since  w'  is  a  convex  angle,  we  form  a  short  spiral  from  p  to  w 
that  includes  edge  {w,w')  on  the  outer  chain.  Since  the  cut  is  tangent  to  the  inner  chain  at 
w,  we  have  a  spiral  from  w'  to  q  with  edge  {w,w')  on  the  inner  chjiin.  We  continue  walking 
and  cutting  until  only  short  spirals  remain. 

Any  line  that  intersects  an  added  edge  does  so  once  as  an  inner  and  once  as  an  outer 
edge.  Thus,  the  added  edges  do  not  contribute  to  the  difference  between  the  number  of 
intersections  of  a  line  with  the  inner  and  outer  chains.  Since  the  conclusion  holds  for  each 
of  the  short  spirals,  it  holds  for  the  original  spiral.       | 

Before  we  show  how  to  find  the  graph  of  adjacent  shortest  paths,  we  bound  its  size.  We  prove 
the  following  lemma. 

Lemma  3.13  Let  G  be  the  multigraph  obtained  by  replacing  each  Steiner  path  Stp{p,q)  between  a 
pair  of  adjacent  real  vertices,  p  and  q,  with  the  shortest  path  avoiding  the  tree,  sp{p,  q).  The  number 
of  (straight)  edges  in  G  is  0{n). 

Proof:  We  first  show  that  G  is  planar.  Let  e  be  any  edge  on  the  inner  chain  of  a  spiral.  If 
we  extend  e  in  both  directions  until  it  hits  the  spiral  boundary,  it  will  hit  the  outer  chain 
of  the  spiral  (because  all  vertices  on  the  inner  chain  are  reflex).  See  Figure  5.  Any  edge  e' 
that  crosses  e  without  crossing  the  outer  chain  must  end  in  the  shaded  region  of  Figure  5; 
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Figure  5:  No  two  shortest  path  edges  cross. 

in  particular,  since  the  only  real  vertices  on  the  shaded  region  are  the  endpoints  of  e,  at 
least  one  endpoint  of  e'  is  not  a  real  vertex.  Because  the  edges  of  a  shortest  path  end  at  real 
vertices,  no  shortest  path  edge  can  cross  e.  Thus  no  two  edges  of  G  can  cross:  G  is  planar. 

Next  we  must  show  that  any  one  edge  is  not  used  in  too  many  shortest  paths.  Let  e 
be  an  oriented  edge  of  G.  Edge  e  belongs  to  some  spiral,  which  lies  either  to  the  left  or 
to  the  right  of  e.  Because  the  interior  of  a  spiral  is  empty,  if  we  extend  the  perpendicular 
bisector  of  e  into  the  spiral,  we  must  hit  the  outer  chain  of  the  spiral.  This  first  intersection 
of  the  bisector  with  the  Steiner  tree  is  unique,  and  it  lies  on  a  unique  Steiner  path,  whose 
endpoints  therefore  define  a  unique  spiral.  Thus  at  most  two  spirals  contain  e,  one  to  its 
left  and  one  to  its  right. 

We  have  shown  that  G  is  a  straight-edge  planar  multigraph  on  n  vertices  in  which  each 
edge  appears  at  most  twice.  Therefore,  G  has  0{n)  edges.       | 

We  now  have  enough  machinery  to  indicate  how  Guibas  and  Hershberger's  shortest  path  algo- 
rithm constructs  the  graph  of  shortest  paths  between  adjacent  vertices,  and  to  analyze  the  amount 
of  time  and  space  required. 

Theorem  3.14  Given  a  Steiner  tree,  in  0(n log  n)  time  and  space  we  can  construct  a  spanning 
tree  of  the  original  points  that  has  no  Steiner  points,  and  whose  stabbing  number  with  respect  to 
any  line  is  at  most  twice  that  of  the  Steiner  tree. 

Proof:  Take  an  Eulerian  tour  r  around  the  Steiner  tree — doubling  the  stabbing  number  of 
each  line — and  identify  the  pairs  of  adjacent  real  vertices  on  the  tour.  Using  the  algorithm 
of  Guibas  and  Hershberger  [GH87],  we  can  find  the  shortest  path  between  any  pair  of  points 
p  and  q  in  time  0(log  n  +  k),  where  k  is  number  of  edges  along  the  path. 

As  a  preprocessing  step  for  finding  shortest  paths,  triangulate  the  bays  defined  by  the 
Steiner  tree  and  its  convex  hull  in  O(nlogn)  time.  The  algorithm  of  Guibas  and  Hershber- 
ger [GH87]  requires  a  linear  amount  of  preprocessing  once  a  triangulation  is  given.  It  finds 
the  shortest  path  sp{p,q)  in  time  0(logn  -|-  k),  where  k  is  the  number  of  edges  on  sp{p,q). 
In  our  application.  Lemma  3.13  proves  that  the  total  number  of  edges  on  all  shortest  paths 
is  0{n)\  thus  the  logarithmic  overhead  for  each  path  determines  the  running  time,  giving 
0(n  log  n)  in  total. 
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Lemmas  3.11  and  3.12  prove  that,  for  any  line,  the  stabbing  number  of  the  graph  of  all 
shortest  paths  is  no  larger  than  that  of  the  Eulerian  tour.  Building  a  spanning  tree  on  this 
graph  completes  the  algorithm.     I 

4     Representing  and  computing  convex  hulls 

This  section  takes  the  O(logn)  spanning  trees  provided  by  Section  3  and  transforms  them  into  a 
data  structure  that  supports  face  queries  on  A{L).  We  describe  a  query  algorithm  that,  given  a 
query  point  p,  returns  an  implicit  representation  of  the  idjce  of  A(L)  that  contains  p.  If  the  dual 
line  of  p  has  s  as  its  minimum  stabbing  number  with  respect  to  any  of  the  spanning  trees,  then 
the  query  takes  A{s)  time.  The  implicit  representation  returned  by  the  query  can  be  used  to  find 
any  of  several  face  properties  (intersections  with  a  line,  extreme  vertices,  the  edge  below  p,  and 
the  number  of  face  edges,  for  example)  in  additional  O(logTi)  time,  or  to  list  the  K  face  edges  in 
additional  0{K  +  log  n)  time. 

Our  algorithm  uses  geometric  duality  to  answer  face  queries.  The  dual  of  the  face  of  A{L)  that 
contains  p  is  an  "hourglass"  shape  in  the  configuration  P,  as  shown  in  Figure  1.  The  "hourglass" 
is  defined  by  the  inner  common  tangents  of  the  two  convex  hulls  h{P  D  t"^)  and  h{P  r\l~).  Our 
algorithm  computes  the  two  hulls,  finds  the  common  tajigents,  and  transforms  the  results  to  answer 
the  primal  query.  Its  basic  operation  is  computing  h{Pr\t*),  for  t'  €  {l'^,l~}. 

We  transform  the  trees  of  Section  3  into  simple  paths,  which  are  easier  to  work  with  than  trees. 
For  a  particular  tree,  we  trace  around  the  boundary  of  the  tree  to  get  an  Eulerian  tour,  then  drop 
an  edge  to  get  a  path.  Every  non-leaf  vertex  of  the  tree  appears  more  than  once  along  the  path. 
However,  in  what  follows  we  will  assume  that  the  path  is  simple — that  overlaps  do  not  occur.  We 
can  simulate  this  condition  during  the  running  of  our  algorithm  using  standard  techniques.  Let  T 
denote  the  simple  polygonal  path  that  we  get  from  the  spanning  tree. 

We  first  preprocess  T  to  get  a  structure  called  a  bridge  tree  that  represents  the  convex  hull  of 
a  simple  polygonal  path  and  supports  several  operations  on  that  hull,  including  finding  tangents. 
Let  the  s  subpaths  of  F  included  in  the  query  halfplane  i'  be  71, ...  ,7a.  We  determine  the  7i's  by 
intersecting  i  with  F  in  0(5  log  n)  time  [CG86].  (Actually,  if  we  use  the  algorithm  of  Theorem  3.5, 
which  builds  O(logn)  trees,  we  have  O(logn)  paths,  and  we  want  to  use  the  one  with  fewest 
intersections  with  £.  To  find  it,  we  run  the  intersection-finding  algorithm  of  [CG86]  in  lock-step 
on  all  the  paths  at  the  same  time  and  stop  when  all  intersections  with  one  of  the  paths  have  been 
found.  This  path  will  be  denoted  F.  This  takes  O(alog^n)  time  altogether.)  We  compute  the 
bridge  tree  for  each  7,,  then  use  these  bridge  trees  to  find  h{fi  U  •  •  •  U  7,). 

We  now  give  an  intuitive  description  of  bridge  trees,  then  formally  list  their  properties.  Imple- 
mentation details  are  omitted  here,  but  will  appear  in  a  forthcoming  paper  on  bridge  trees  [GHS88]. 
A  bridge  tree  ^(7)  represents  the  convex  hull  of  a  simple  polygonal  path  7.  Bridge  trees  are  similar 
to  data  structures  used  by  Overmars  and  van  Leeuwen  [OvL81]  and  Guibas  and  Hershberger  [GH87]. 
Bridge  trees  exploit  the  observation  that  if  paths  7  and  7'  are  disjoint,  then  the  hulls  h{f)  and 
h{Y)  have  at  most  two  common  tangents  [CG86].  (This  is  a  crucial  observation  for  our  algorithm, 
and  is  the  reason  for  the  tree-untangling  step  of  Section  3.4.)  If  7  can  be  split  into  7/  and  7,  by 
erasing  an  edge,  then  h{f)  is  either  identical  to  one  of  /»(7/)  and  h{fr),  or  its  boundary  consists 
of  a  convex  chain  from  h{ji),  a  convex  chain  from  /i(7r),  and  two  outer  common  tangents  joining 
the  chains.  The  bridge  tree  for  7  stores  the  two  tangents  explicitly  and  represents  the  chains  by 
pointers  to  B{ji)  and  B{fT)- 


20 


To  find  the  bridge  tree  for  T,  we  remove  an  edge  from  the  middle  of  T  to  get  the  subpaths 
F)  and  Tr,  then  build  B{T)  from  the  recursively  constructed  B(Ti)  and  B{Tr).  An  arbitrary  sub- 
path  7  of  r  can  be  expressed  as  the  concatenation  of  0(log  n)  of  the  paths  produced  during  the 
construction  of  B{T).  We  use  the  bridge  trees  for  these  paths  to  find  6(7). 

Let  us  state  the  properties  of  bridge  trees  more  form2Llly. 


(1)  We  can  build  a  bridge  tree  B{T)  for  F  in  0{n)  time  and  space. 

(2)  Given  5(F),  we  can  extract  5(7)  for  any  subpath  7  of  F  in  O(log^  n)  time. 
Given  a  bridge  tree  ^(7),  we  can  in  0(log  n)  time 

(3)  Find  the  tangents  to  h{f)  that  pass  through  a  given  point, 

(4)  Find  the  extreme  vertex  of  h{f)  in  a  query  direction, 

(5)  Find  the  intersections  of  h{f)  with  a  line,  and 

(6)  Count  the  number  of  hull  edges  between  two  vertices  of  h{f). 

We  can  also 

(7)  List  the  k  edges  of  /i(7)  between  two  hull  vertices  in  O(log  n  +  k)  time. 
Given  ^(7)  and  B{Y)  for  disjoint  paths  7  and  7',  we  can 

(8)  Find  the  inner  common  tangents  of  h{f)  and  h{Y)  in  O(logn)  time,  if  they  exist, 
and 

(9)  Find  the  outer  common  tangents  of  h{f)  and  h{f')  in  0(log''  n)  time,  if  /i(7)  and 
/i(7')  intersect  in  at  most  two  points  and  we  are  given  vertices  on  h{-f)  and  h{Y) 
that  lie  on  h{f  U  7'). 


Our  representation  of  the  hxill  H  =  /i(7i  U  •  •  •  U  7,)  uses  bridge  trees  as  building  blocks.  The 
hull  H  consists  of  chains  from  the  hulls  of  the  7,'s  alternating  with  tangent  edges  that  link  the 
chains.  Between  the  endpoints  of  a  chain  from  a  single  hull  h{fi),  the  hull  H  is  identical  to  h{fi). 
We  represent  each  h{fi)  by  its  bridge  tree  B{ji)\  we  represent  .?  as  a  circular  list  of  tangent  edges 
alternating  with  pointers  to  these  bridge  trees.  We  call  this  structure  a  necklace:  we  think  of 
the  bridge  trees  as  beads  strung  on  a  loop  of  tangent  edges.  A  necklace  built  from  hulls  of  a 
disjoint  simple  paths  has  size  0{s),  exclusive  of  the  bridge  trees  it  points  to  [KLPS86].  If  we  store 
the  tangent  edges  in  a  binary  search  tree  augmented  with  hull  edge  counts,  the  resulting  hybrid 
structure  supports  operations  (3)-{9)  listed  above. 

We  use  B{T)  to  produce  B{'n),...,B{'),)  in  Oislog^  n)  time.  In  the  course  of  calculating 
/i(7i  U  •  •  •  U  7,),  we  need  to  use  operation  (9)  to  find  the  tangents  of  certain  pairs  of  hulls.  The 
following  lemma  shows  how  to  find  the  special  vertices  that  operation  (9)  requires. 

Lemma  4.1  Let  7  and  7'  be  disjoint  simple  polygonal  paths  whose  convex  hulls  are  represented  by 
B{-j)  and  B{t')-  U  ^^  <""€  given  a  ray  r  that  originates  at  a  vertex  of -(  and  does  not  intersect  7', 
we  can  determine  whether  7  lies  inside  the  hull  of  Y  in  0{\ogn)  time.  If  f  is  not  contained  in 
h(Y),  we  can  in  O(logn)  time  find  a  vertex  off  that  lies  on  h{f  U  7'). 
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Proof:  If  7  lies  inside  h{f'),  it  must  lie  in  some  bay  bounded  by  7'  and  an  edge  e  of  h{Y). 
In  this  case  r  must  intersect  e.  If  7  is  not  fully  contained  in  h{Y),  r  may  or  may  not  intersect 
the  hull  of  7'.  If  r  intersects  ^(7'))  let  e  be  the  huU  edge  it  cuts.  An  edge  of  7  must  cross  e 
to  get  out  of  h{'f'),  and  hence  some  vertex  of  7  lies  on  the  opposite  side  of  e  from  7'.  Thus 
7  is  contained  in  h{f')  if  and  only  if  r  intersects  an  edge  e  of  h(j')  and  h('()  lies  on  the 
same  side  of  e  as  7'.  Since  bridge  trees  allow  each  of  the  operations  necessary  to  test  this 
condition  to  be  performed  in  O(logn)  time,  the  first  claim  follows. 

If  7  is  not  fully  contained  in  h{Y),  we  want  to  find  a  vertex  of  7  on  hifUf')-  If ''  intersects 
/i(7'),  let  €  be  the  edge  it  cuts;  if  r  does  not  intersect  h{f'),  define  e  to  be  a  tangent  edge 
from  the  origin  of  r  to  h{Y).  In  both  cases,  the  vertex  of  h{j)  where  the  tangent  line  is 
parallel  to  e  and  where  7  and  7'  lie  on  the  same  side  of  the  tamgent  is  on  /i(7  U  7').  Using 
operations  (3)-(5),  we  can  find  this  vertex  in  O(logn)  time.       | 

This  lemma  lets  us  find  the  common  tangents  (if  any)  of  h(fi)  and  /»(7j)  for  any  t  and  j.  The 
edges  of  T  cut  by  our  dual  query  line  i  provide  the  rays  that  the  lemma  requires:  At  least  one 
endpoint  of  7,  belongs  to  an  edge  of  F  cut  by  I.  The  edge  can  be  extended  to  infinity  on  the  other 
side  of  ^  without  hitting  7j,  and  hence  can  be  used  in  Lemma  4.1.  Using  operation  (9),  we  can  find 
the  desired  common  tangents,  if  they  exist.  We  use  this  fact  in  the  proof  of  the  main  result  of  this 
section: 

Theorem  4.2  Given  an  untangled  spanning  tree  T  of  the  n  points  in  P,  we  can  preprocess  it  in 
linear  time  to  answer  halfplane  convex  hull  queries.  Given  a  query  halfplane  t' ,  we  can  compute  a 
necklace  representation  of  h{P  n£')  in  O(Jlog^n)  time,  where  s  is  the  stabbing  number  of  £  with 
respect  to  T. 

Proof:  The  preprocessing  phase  computes  a  path  T  from  the  given  spanning  tree,  then 
builds  the  bridge  tree  B{T).  At  query  time  we  find  the  subpaths  71, . . .  ,7,  in  ^  flF  [CG86], 
then  build  ^(71),. . .  ,5(7,)  in  0(Jlog^  n)  time.  (Note  that  s  is  between  2s  -2  and  25.)  We 
combine  h{i\),. . .  ,h{'j,)  into  a  single  hull  using  a  divide-and-conquer  approach. 

We  find  the  necklace  for  /i(7i  U  ■  •  -07,)  by  first  partitioning  the  set  {71, . . .  ,7*}  into  two 
subsets  of  equal  or  nearly  equal  size,  next  finding  the  necklace  for  each  subset  separately, 
and  finally  merging  the  two  necklaces  into  a  single  necklace.  We  must  describe  the  merging 
step. 

We  split  each  necklace  at  its  leftmost  and  rightmost  vertices,  then  merge  upper  and 
lower  necklaces  separately.  We  merge  two  upper  necklaces  using  a  left-to-right  sweep.  For 
each  upper  necklace,  the  endpoints  of  the  tangent  edges  specify  a  sequence  of  i-coordinates. 
We  merge  the  sequences  for  the  two  necklaces  to  get  a  single  sequence  with  k  elements 
Xi,X2,--  ■,Xk,  for  some  k  =  0{s).  See  Figure  6.  In  any  interval  (x,, i^+i),  each  necklace 
has  just  one  element,  either  a  piece  of  a  subpath  hull  or  a  tangent  edge.  We  scan  through 
these  intervals  from  left  to  right.  In  each  interval  we  find  the  joint  upper  hull  of  the  two 
elements  in  C'(log^n)  time  using  Lemma  4.1  and  operations  (3),  (5),  and  (9).  We  find  the 
complete  upper  hull  of  the  original  two  necklaces  using  a  Graham  scan  [PS85]  through  the 
intervals,  at  each  step  finding  the  common  tangent  from  the  hull  in  the  current  interval  to 
the  convex  hull  to  its  left.  Operation  (9)  is  applied  0{k)  times  during  the  scan,  and  thus  the 
scan  takes  0(^log^  n)  time.  The  resulting  hull  is  a  list  of  subpaths  and  common  tangents 
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between  them,  but  the  hull  may  be  split  unnecessarily  at  interval  endpoints.  In  0{k)  time 
we  can  merge  adjacent  huU  pieces  that  come  from  the  same  subpath  or  tangent  edge.  This 
gives  us  the  merged  hull  in  necklax;e  form,  as  desired. 


Figure  6:  Intervals  arising  from  breaking  necklaces  at  tangents 

Since  merging  two  necklaces  defined  by  a  total  of  k  subpaths  takes  0(k\og^  n)  time, 
finding  the  necklace  of  all  5  subpaths  by  divide-and-conquer  takes  O(slog'^n)  time.       | 

Our  divide-and-conquer  algorithm  to  find  the  convex  hull  may  seem  like  overkill:  one  might 
expect  to  be  able  to  find  the  hull  by  scanning  through  7i,...,7j  once,  in  the  order  given  by 
the  intersections  of  T  with  £.  However,  this  simple  approach  is  too  naive.  One  subpath  7,  can 
appear  on  the  necklace  many  times.  Furthermore,  the  order  of  subpath  appearances  along  the 
necklace  is  not  necessarily  the  same  as  the  order  given  by  the  intersections  of  T  with  the  halfplane 
boundary.  However,  it  is  possible  that  some  method  could  use  the  intersection  order  to  speed  up 
the  construction. 

5     Implicit  representation  of  line  arrangements 

In  this  section  we  summarize  how  the  techniques  of  the  preceding  sections  can  be  used  in  combi- 
nation to  represent  an  arrangement  of  lines  implicitly. 

The  first  step  is  to  dualize  the  set  of  n  lines  L  to  get  a  set  of  points  P.  A  query  point  p  then 
maps  to  a  line  I,  and  the  face  of  A{L)  that  contains  p  maps  to  an  hourglass  shape  determined  by 
the  inner  common  tangents  of  h{P  n  t"^)  and  h{P  n  /"),  as  shown  in  Figure  1. 

We  preprocess  P  for  finding  an  implicit  representation  of  h{P  D  I')  for  any  hzJfplane  /*  G 
{t'^,l~}.  We  use  Theorem  3.5  to  build  a  collection  of  O(logn)  spanning  trees  of  P  with  the 
property  that,  with  high  probability,  any  line  t  has  stabbing  number  A{y/n)  with  respect  to  at  least 
one  of  them.  The  construction  uses  random  sampling  in  two  places.  One  sample,  taken  from  the 
edges  of  a  spanning  tree,  is  easy  to  pick,  but  the  second,  taken  from  an  implicitly  defined  "bad"  set 
of  O(ti^)  lines,  requires  A(n^/^)  time  to  compute  using  Theorem  3.8.  (Notice,  but  the  way,  that  this 
is  the  only  step  of  the  algorithm  that  takes  more  than  A(n)  time.)  The  spanning  trees  may  have 
self-intersections;  we  remove  these  using  the  algorithm  of  Section  3.4.  Finally,  we  build  a  bridge 
tree  for  the  Eulerian  path  defined  by  each  spanning  tree,  as  specified  in  Theorem  4.2. 

Given  a  query  point  p,  we  dualize  it  to  get  query  line  I,  then  compute  necklaces  for  h{P  D  /"•■) 
and  h{Pnt~)  based  on  the  path  F  that  intersects  I  the  fewest  times.  The  necklaces  let  us  find  the 
inner  common  tangents  of  the  two  huUs  in  O(logn)  time.  The  endpoints  of  the  tangents  delimit  a 


23 


convex  chain  of  each  neckltice  on  the  side  closer  to  t.  These  chains  dualize  to  the  upper  and  lower 
hulls  of  the  face  containing  p.  We  can  use  the  necklace  search  structures  to  calculate  properties 
of  the  face,  simply  interpreting  the  dual  edges  and  vertices  as  primal  vertices  and  edges.  Thus  we 
have  established  the  following  theorem: 

Theorem  5.1  Given  a  set  L  of  n  lines,  we  can  preprocess  it  in  A(n^/^)  randomized  time  and  A(n) 
space  so  that,  with  high  probability,  for  any  query  point  p  we  can  in  A.{y/n)  time  produce  an  implicit 
representation  of  the  face  of  A{L)  that  contains  p.  From  the  implicit  representation  we  can  (1)  find 
the  intersections  of  a  line  with  the  face,  (2)  find  the  extreme  vertex  of  the  face  in  any  direction,  (S) 
count  the  edges  of  the  face,  or  (4)  report  the  edges  of  the  face.  All  the  operations  except  (4)  take 
0(log  n)  time;  reporting  the  edges  of  the  face  takes  O(log  n  +  K)  if  there  are  K  edges. 

The  theorem  is  stated  using  A- notation;  the  actual  bounds  are  C)(n^/^  log' n)  preprocessing 
time,  O(nlogn)  space,  and  0{^/n\og^  n)  query  time. 

Note  that  operation  (1)  of  the  theorem  lets  us  determine  whether  two  query  points  p  and  q  lie  in 
the  same  face.  The  line  determined  by  p  and  q  has  the  same  intersections  with  the  face  containing 
p  and  the  face  containing  q  if  and  only  if  p  and  q  lie  in  the  same  face. 

6     A  trade-off  between  storage  and  query  time 

In  this  section  we  present  a  technique  that  allows  us  to  improve  the  query  time  for  reporting  a 
face  in  an  arrangement  of  lines  at  the  expense  of  more  storage.  The  technique  uses  a  random 
sampling  step  that  is  akin  to  the  one  used  in  [CEG*88]  for  obtaining  a  bound  on  the  combinatorial 
complexity  of  several  cells  in  a  line  arrangement.  The  trade-off  between  storage  and  query  time  is 
controlled  by  a  parameter  r,  1  <  r  <  n.  If  we  increase  storage  to  A(rn)  and  preprocessing  time 
to  A(r^/'n"'/'),  then  we  can  decrease  query  time  to  A{y/n/r).  This  trade-off  carries  all  the  way  to 
r  ss  n,  where  it  corresponds  to  storing  essentially  the  fuU  arrangement  of  the  n  lines  and  doing  a 
point-location  in  it.  As  we  will  see  in  the  next  section,  another  interesting  point  in  the  trade-off 
occurs  when  r  =  n^'^,  in  which  czise  the  query  time  becomes  A(n^'^). 

We  should  also  remark  here  that  since  a  random  sampling  step  is  involved,  the  more  precise 
statement  of  the  trade-off  result  is  that  we  provide  a  search  structure  that  (1)  always  works,  (2) 
takes  space  A(rn),  and  (3)  with  high  probability  takes  preprocessing  time  A(r^/'n''/')  and  can 
answer  any  feice  query  in  time  A{\/n/r). 

Our  technique  proceeds  as  follows.  We  select  at  random  a  sample  of  size  r  among  the  given  lines 
and  triangulate  all  its  faces  by  drawing  auxiliary  edges,  if  necessary.  The  e-net  theory  of  Haussler 
and  Welzl  [HW87]  (or  the  probabilistic  lemma  of  Clarkson  [Cla87])  then  guarantees  that  with  high 
probability  ezxh  of  the  O(r^)  triangles  thus  created  is  intersected  by  at  most  C)(n  log  r/r)  of  the 
arrangement  lines.  This  is  only  slightly  more  than  0(n/r),  the  expected  number  of  lines  that  cut 
through  a  triamgle.  See  [CEG*88]  for  more  details  on  such  arguments.  Now  the  idea  of  the  trade-off 
technique  is  to  treat  all  the  lines  cutting  through  a  triangle  as  a  subproblem  to  be  solved  by  the 
methods  of  Section  4. 

Consider  a  specific  triangle  T  and  the  collection  of  all  original  lines  cutting  it  (strictly  speaking, 
cutting  its  interior).  For  a  query  point  p,  if  p  lies  in  T,  then  we  want  the  subproblem  for  T  to  return 
to  us  the  face  containing  p.  However,  the  face  containing  p  in  the  subproblem  might  not  be  identical 
to  the  face  containing  p  in  the  full  arrangement.  The  reason  is  that  triangles  bordering  T  can  also 
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Figure  7:  The  face  contaialng  p  crosses  a  triangle 

contribute  sides  to  that  face,  as  Figure  7  shows"*.  These  problematic  faces  are  exactly  those  that 
touch  (share  an  edge  with)  a  sample  line,  or  are  crossed  by  an  auxiliary  edge  of  the  triangulation. 
To  deal  with  these  faces,  we  just  precompute  and  store  them  all  in  an  explicit  subdivision  5  of  the 
plane.  We  then  preprocess  5  for  efficient  point  location  by  one  of  the  stztndard  techniques.  Now, 
when  a  query  point  p  is  given,  we  first  do  a  point  location  for  p  in  5.  K  p  falls  into  one  of  the 
convex  faces  of  S  corresponding  to  faces  of  the  original  arrangement  of  the  n  lines,  then  we  report 
that  face  and  we  are  done.  Otherwise  p  must  lie  in  a  face  of  S  that  is  fully  covered  by  a  unique 
triangle  T  of  the  triangulation.  Furthermore,  in  that  case  the  face  of  p  in  the  subproblem  is  also 
the  face  of  p  in  the  full  arrangement.  So  by  an  application  of  the  methods  of  Section  4  we  can 
complete  the  computation. 

What  is  the  cost  of  all  this?  The  cost  of  choosing  the  random  sample  of  the  lines  and  triangu- 
lating the  faces  of  their  arrangement  is  O(r^).  It  is  known  [CGL85,  EOS86]  that  in  an  arrangement 
of  n  lines  the  combinatorial  complexity  of  all  faces  touching  a  particular  line  (the  horizon  or  zone 
of  the  line  [Ede87])  is  0{n).  The  results  of  [EGP*88,  EGS88]  imply  that  these  faces  can  actually 
be  calculated  in  time  A(n).  Thus  the  faces  touching  all  the  sample  lines  can  be  computed  in  time 
A(rn)  cind  stored  in  space  0{rn).  The  faces  crossed  by  the  auxiliary  edges  in  the  triangulation  can 
be  obtained  in  a  similar  fashion,  by  combining  the  horizons  corresponding  to  the  lines  supporting 
these  edges  in  each  of  the  relevant  subproblem  arrangements.  Since  there  are  only  0(n  log  r/r)  lines 
in  each  triangle  subproblem,  a  total  of  O(r^)  trijmgles,  and  at  most  two  auxiliary  edges  bordering 
each  triangle,  we  conclude  that  the  horizons  of  all  auxiliary  edges  can  be  computed  in  A(rn)  time 
and  stored  in  A(rn)  space'*.  Finally,  since  the  point  location  structure  takes  linear  space  in  the  size 
of  the  underlying  subdivision  5,  the  overall  space  that  we  need  is  A(rn)  for  the  that  structure  and 
also  A(rn)  in  total  for  the  triangle  subproblems. 

Theorem  6.1  For  each  r,  1  <  r  <  n,  we  can  build  a  structure  for  the  face  reporting  problem  that 
takes  A(rn)  space  and  which,  with  high  probability,  can  be  built  in  time  A{r^'^n^'^)  and  allows  us 
to  answer  any  face  query  in  time  A(i/n/r)  +  0{K),  where  K  is  the  size  of  the  reported  face. 


^In  fact  the  face  containing  p  in  the  full  arrangement  can  cut  across  many  such  triangles. 

*In  this  description  we  have  skipped  over  a  number  of  straightforwiird  details  that  are  left  to  the  reader. 
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7     Implicit  faces  in  arrangements  of  segments 

In  this  section  we  will  extend  our  techniques  to  collections  of  line  segments,  as  opposed  to  infinite 
lines.  In  this  situation  many  new  difficulties  arise:  the  faces  we  may  wish  to  report  need  not 
be  convex,  or  even  simply  connected — see  Figure  8.  Our  previous  implicit  techniques  cannot  be 
extended  easily,  as  we  have  made  very  heavy  use  of  convexity.  Instead,  our  approach  will  use  the 
method  of  partitioning  into  "triangular"  subproblems,  as  developed  in  the  previous  section.  The 
reason  partitioning  helps  is  that  in  a  "small"  triangle  T  we  can  hope  that  most  segments  intersecting 
T  actually  go  all  the  way  through  T.  Since  we  are  interested  only  in  faces  clipped  to  within  T, 
when  we  solve  the  subproblem  associated  with  T  we  can  treat  these  long  segments  as  infinite  lines. 
A  similar  idea  has  been  used  in  [GOS87]  to  solve  the  problem  of  counting  or  reporting  line  segment 
intersections. 


Figure  8:  A  face  in  an  arrangement  of  segments  that  is  neither  convex  nor  simply  connected 

Let  L  denote  the  set  of  lines  supporting  our  n  given  segments,  and  G  denote  the  set  of  endpoints 
of  these  segments.  As  in  the  previous  section,  we  wiU  now  use  a  random  sampling  technique  to 
partition  our  problem  into  subproblems.  Each  subproblem  will  be  associated  with  a  triangle  T  and 
will  consist  of  all  the  original  segments  cutting  through  T.  Let  r  be  an  integer  parameter  to  be 
fixed  later.  We  choose  a  random  saimple  L'  of  r  of  the  lines  in  L  and,  independently,  a  random 
sample  G'  of  r^  of  the  points  in  G.  We  now  triangulate  the  arrangement  of  the  lines  in  L'  with 
the  additional  points  of  G"  thrown  in — see  Figure  9.  By  using  O(r^)  triangles  we  can  obtain  a 
triangulation  such  that  no  triangle  contains  one  of  the  points  in  G'  in  its  interior,  or  has  its  interior 
intersected  by  one  of  the  lines  in  L'.  Now  the  e-net  theory  [HW87,  Cla87]  allows  us  to  conclude 
that,  with  high  probability,  no  triangle  is  intersected  by  more  than  O(nlogr/r)  of  the  lines  in  L, 
and  no  triangle  contains  more  than  0(n log r/r^)  of  the  points  in  G.  Notice  that  in  this  step  the 
points  in  G  are  partitioned  among  the  triangles,  while  each  line  in  L  may  be  distributed  to  many 
triangles.  Thus,  on  the  average,  this  partition  step  will  decrease  the  number  of  points  in  a  triangle 
more  than  it  will  decrease  the  number  of  lines. 

We  now  construct  and  store  aU  faces  of  the  original  arrangement  of  the  n  segments  that  are 
either  bordered  by  one  of  the  r  sample  lines  or  crossed  by  one  of  the  auxiliary  O(r^)  triangulation 
edges.  These  faces  belong  either  to  the  r  horizons  or  zones  [Ede87]  of  the  sample  lines  in  our 
arrangement  of  n  segments,  or  to  the  O(r^)  horizons  of  the  auxiliary  edges  in  arrangements  of 
0(nlogr/r)  segments  each.  The  results  of  [EGP*88,  EGS88]  can  be  used  to  prove  that  the  total 
complexity  of  these  faces  is  only  A(rn),  in  that  they  can  all  be  calculated  in  A(rn)  time.  All  these 
faces  together  define  a  subdivision  of  the  plane,  which  we  preprocess  for  efficient  point  location. 
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Figure  9:    We  triangulate  the  arrangement  of  sampled  lines  A{L')  together  with  the  sampled 
points  G' 

The  overall  time  and  storage  cost  of  that  step  is  A(rn). 

Our  query  algorithm  then  begins  by  doing  a  point  location  in  the  above  structure.  If  the  query 
point  p  happens  to  lie  in  one  of  the  explicitly  stored  faces,  then  we  can  complete  processing  the 
query  without  having  to  use  any  additional  structures.  Otherwise  p  falls  into  a  region  that  is  fuUy 
covered  by  a  unique  triangle  T  of  the  triangulation  Jtnd  has  the  property  that  the  face  containing  p 
in  the  subproblem  corresponding  to  T  is  exactly  the  face  of  p  in  the  fuU  arrangement. 

How  do  we  solve  these  triangle  subproblems?  Let  us  concentrate  on  a  particular  triangle  T. 
The  set  of  lines  Lj  consists  of  the  lines  supporting  the  segments  cutting  T,  and  the  set  of  points 
Gt  consists  of  the  endpoints  of  these  segments  that  actually  lie  in  T.  It  is  useful  to  categorize  the 
segments  crossing  T  into  two  groups:  the  "long"  segments  that  cut  all  the  way  through  T,  and  the 
"short"  segments  that  have  an  endpoint  in  T.  Notice  that  there  cam  be  no  more  short  segments  in  T 
than  the  number  of  points  in  Gt-  If  we  choose  r  «  n^/^,  then  the  €-net  bounds  imply  that  that  the 
number  of  lines  in  Lj  (and  therefore  the  number  of  long  segments)  is  at  most  0(Ti^/^logn),  while 
the  number  of  points  in  Gt  (and  therefore  the  number  of  short  segments)  is  at  most  0{n^^^  log  n). 
The  fact  that  the  bound  on  the  number  of  short  segments  is  only  slightly  greater  than  the  square 
root  of  the  bound  on  the  number  of  long  segments  is  favorable  for  us,  as  we  will  now  see.  This 
consideration  determined  the  choice  of  r  made  above. 

For  the  triangle  T  we  store  two  separate  structures:  the  full  arrangement  of  the  short  segments 
of  r,  and  the  data  structure  of  Theorem  5.1  for  the  duals  of  the  infinite  lines  supporting  the  long 
segments  of  T.  Because  of  our  choice  of  r,  the  total  size  of  these  two  structures  is  only  A(n'/"'). 
Summed  over  all  0{r^)  triangles,  the  total  is  A(n^/'').  In  combination  with  the  point-location 
structure  described  above,  the  overall  storage  that  we  use  is  A(ti*'"');  the  preprocessing  cost  is,  with 
high  probability,  A(ti^/^). 

To  answer  a  query  in  T,  we  separately  solve  it  in  the  (explicit)  arrangement  of  the  short  segments 
in  T,  and  in  the  (implicit)  arrangement  of  the  long  segments  in  T;  then  we  combine  the  results.  It 
is  easy  to  see  how  this  works  for  the  types  of  queries  we  have  been  concerned  with,  except  for  face 
reporting,  where  we  encounter  a  technical  difficulty:  the  two  subfaces  obtained  in  the  two  separate 
arrangements  can  have  much  larger  size  than  the  desired  face,  which  is  a  connected  component  of 
their  intersection. 

There  is,  however,  a  lot  of  structure  in  the  situation  that  we  can  exploit.  Recall  that  what  we 
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desire  is  to  answer  the  query  at  a  cost  which  is  A(n^'^)  +  0{K),  where  K  is  the  output  size.  By 
the  results  in  [EGP*88],  the  "short"  face  has  size  only  A(n^/^),  so  we  can  afford  to  look  at  all  of  it. 
Notice  also  that  the  implicitly  represented  "long"  face  is  convex. 

To  deal  with  the  above  difficulty,  we  can  start  out  by  triangulating  the  short  face.  Then 
we  intersect  separately  each  of  these  triangles  with  the  long  face.  A  triangle  can  be  intersected 
efficiently  with  the  long  face:  we  just  intersect  the  long  face  with  each  of  the  lines  defining  the 
triangle  and  collect  the  appropriate  pieces.  By  the  results  of  Section  4,  we  can  construct  an 
implicit  representation  of  the  portion  of  the  long  face  inside  each  triangle  in  A(l)  time  per  triangle. 
We  can  now  put  all  the  separate  intersections  together  into  the  true  intersection  of  the  long  and 
short  faces.  Notice  that  the  additional  breakage  introduced  along  the  long  face  by  the  edges  used 
for  triang\ilating  the  short  face  is  only  A{n^^^)  (two  breakpoints  per  diagonal),  so  it  can  be  paid 
out  of  the  allowable  overhead.  This  computation  might  actually  produce  several  intersection  faces, 
not  just  the  one  containing  the  query  point  p.  The  others,  however,  can  be  easily  disposed  of  if 
we  keep  around  the  dual  graph  of  the  triangulation  of  the  short  face.  We  can  start  at  the  node  t; 
of  this  graph  corresponding  to  the  query  point  p  and  in  a  depth-first  traversal  from  v  collect  all 
the  pieces  corresponding  to  the  final  face.  Once  we  have  collected  these  pieces,  we  expand  their 
implicit  representations  to  get  an  explicit  description  of  the  desired  component  of  the  intersection 
of  the  long  and  short  faces.  The  expansion  takes  A(n^/^)  +  0{K)  time,  where  K  is  the  size  of  the 
final  face.  This  completes  our  argument. 

Theorem  7.1  Given  n  segments  in  the  plane,  then  in  A{n*'^)  space  we  can  construct  a  data 
structure  such  that,  with  high  probability,  the  preprocessing  cost  is  A{n^^^),  and,  for  every  query 
point  p,  the  face  of  the  arrangement  of  these  segments  containing  p  can  be  reported  in  time  A{n^^^)  + 
0{K),  where  K  is  the  size  of  the  desired  face. 

8     Reporting  many  faces 

From  the  results  of  [CEG*88]  it  is  known  that  the  complexity  of  m  distinct  faces  in  an  arrangement 
of  n  lines  in  the  plane  is  0{m}l'^n^^^  +  n).  The  method  used  in  that  paper  is  based  on  a  random 
sampling  step  like  the  one  presented  in  the  previous  section.  If  we  think  of  the  m  desired  faces 
as  being  specified  by  m  given  points,  then  the  random  sampling  step  selects  r  lines  and  r^  points 
and  triangulates  the  arr«ingement  of  these  lines  and  points.  Again  by  the  (-net  property,  each 
triangle  subproblem  has  no  more  than  O(nlogr/r)  lines  and  no  more  than  0(m  log  r/r^)  points. 
The  proper  value  of  r  to  use  here  is  r  =  vri^l^ Jn^l^  because  that  makes  the  bound  on  the  number 
of  points  per  triangle  A{n'^l^  jm}!^),  which,  just  as  before,  is  roughly  the  square  root  of  the  bound 
on  the  number  of  lines  per  triajigle,  A{n*l^  fm?!^). 

Suppose  now  that  we  use  our  face  reporting  technique  individually  for  each  point  in  each 
triangle  subproblem.  However,  since  in  each  face  we  already  know  the  queries  we  will  make,  we 
can  use  the  technique  of  Theorem  3.4  to  build  a  single  tree  that  is  good  for  the  pre-specified 
queries  on  the  average.  Thus  we  can  avoid  the  extra  cost  associated  with  Theorem  3.8;  with  high 
probability,  it  takes  only  A(n'*/^/m^/-')  time  to  construct  a  structure  that  allows  face  reporting 
with  A{n^^^/m^'^)  overhead  on  the  average  for  each  face  query.  The  overall  cost  per  subproblem 
is  A{n*l^ fm^/^)  +  0{K).  where  K  is  the  total  size  of  the  faces  to  be  reported.  If  we  sum  this 
over  all  triangles  and  in  addition  take  into  consideration  the  cost  of  the  point  location  structure 
implied  by  the  partitioning  step,  we  obtain  the  following  result  on  the  cost  of  reporting  m  faces  in 
an  arrangement  of  n  lines. 
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Theorem  8.1  Given  m  points  lying  in  distinct  faces  of  an  arrangement  of  n  lines,  then,  with  high 
probability,  in  time  A{m'^/^n'^/^  +  n)  and  space  0{m^l^n'^l^  +  n),  we  can  report  the  faces  containing 
these  points. 

This  result  is  within  a  polylogarithmic  factor  of  the  worst-case  output  complexity  [CEG*88]. 
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